Preface



This book is motivated by the following technological developments: high quality 
integrated sensors and actuators, powerful control processors that can implement 
complex control algorithms, and powerful computer hardware and software that can 
be used to design and analyze control systems. We believe that these technological 
developments have the following ramifications for linear controller design: 

• When many high quality sensors and actuators are incorporated into the de- 
sign of a system, sophisticated control algorithms can outperform the simple 
control algorithms that have sufficed in the past. 

• Current methods of computer-aided control system design underutilize avail- 
able computing power and need to be rethought. 

This book is one small step in the directions suggested by these ramifications. 



We have several goals in writing this text: 

• To give a clear description of how we might formulate the linear controller 
design problem, without regard for how we may actually solve it, modeling 
fundamental specifications as opposed to specifications that are artifacts of a 
particular method used to solve the design problem. 

• To show that a wide (but incomplete) class of linear controller design problems 
can be cast as convex optimization problems. 

• To argue that solving the controller design problems in this restricted class is 
in some sense fundamentally tractable: although it involves more computing 
than the standard methods that have "analytical" solutions, it involves much 
less computing than a global parameter search. This provides a partial answer 
to the question of how to use available computing power to design controllers. 

• To emphasize an aspect of linear controller design that has not been emphasized
in the past: the determination of limits of performance, i.e., specifications
that cannot be achieved with a given system and control configuration.

It is not our goal to survey recently developed techniques of linear controller design, 
or to (directly) teach the reader how to design linear controllers; several existing 
texts do a good job of that. On the other hand, a clear formulation of the linear 
controller design problem, and an understanding that many of the performance 
limits of a linear control system can be computed, are useful to the practicing 
control engineer. 

Our intended audience includes the sophisticated industrial control engineer, and 
researchers and research students in control engineering. 

We assume the reader has a basic knowledge of linear systems (Kailath [Kai80], 
Chen [Che84], Zadeh and Desoer [ZD63]). Although it is not a prerequisite, the 
reader will benefit from a prior exposure to linear control systems, from both the 
"classical" and "modern" or state-space points of view. By classical control we refer 
to topics such as root locus, Bode plots, PI and lead-lag controllers (Ogata [Oga90], 
Franklin, Powell, and Emami-Naeini [FPE86]). By state-space control we mean the
theory and use of the linear quadratic regulator (LQR), Kalman filter, and linear
quadratic Gaussian (LQG) controller (Anderson and Moore [AM90], Kwakernaak 
and Sivan [KS72], Bryson and Ho [BH75]). 

We have tried to maintain an informal, rather than completely rigorous, approach 
to the mathematics in this book. For example, in chapter 13 we consider linear 
functionals on infinite-dimensional spaces, but we do not use the term dual space,
and we avoid any discussion of their continuity properties. We have given proofs and 
derivations only when they are simple and instructive. The references we cite con- 
tain precise statements, careful derivations, more general formulations, and proofs. 
We have adopted this approach because we believe that many of the basic ideas 
are accessible to those without a strong mathematics background, and those with the 
background can supply the necessary qualifications, guess various generalizations, 
or recognize terms that we have not used. 

A Notes and References section appears at the end of each chapter. We have 
not attempted to give a complete bibliography; rather, we have cited a few key 
references for each topic. We apologize to the many researchers and authors whose 
relevant work (especially, work in languages other than English) we have not cited. 
The reader who wishes to compile a more complete set of references can start by 
computing the transitive closure of ours, i.e., our references along with the references 
in our references, and so on.

Chapter 1

Control Engineering and 
Controller Design 



Controller design, the topic of this book, is only a part of the broader task of 
control engineering. In this chapter we first give a brief overview of control 
engineering, with the goal of describing the context of controller design. We then 
give a general discussion of the goals of controller design, and finally an outline 
of this book. 



1.1 Overview of Control Engineering 

The goal of control engineering is to improve, or in some cases enable, the perfor- 
mance of a system by the addition of sensors, control processors, and actuators. The 
sensors measure or sense various signals in the system and operator commands; the 
control processors process the sensed signals and drive the actuators, which affect 
the behavior of the system. A schematic diagram of a general control system is 
shown in figure 1.1. 

This general diagram can represent a wide variety of control systems. The sys- 
tem to be controlled might be an aircraft, a large electric power generation and 
distribution system, an industrial process, a head positioner for a computer disk 
drive, a data network, or an economic system. The signals might be transmitted 
via analog or digitally encoded electrical signals, mechanical linkages, or pneumatic 
or hydraulic lines. Similarly the control processor or processors could be mechanical, 
pneumatic, hydraulic, analog electrical, general-purpose or custom digital comput- 
ers. 

Because the sensor signals can affect the system to be controlled (via the control
processor and the actuators), the control system shown in figure 1.1 is called
a feedback or closed-loop control system, which refers to the signal "loop" that
circulates clockwise in this figure. In contrast, a control system that has no sensors,
and therefore generates the actuator signals from the command signals alone, is
sometimes called an open-loop control system. Similarly, a control system that has
no actuators, and produces only operator display signals by processing the sensor
signals, is sometimes called a monitoring system.

Figure 1.1 A schematic diagram of a general control system.

In industrial settings, it is often the case that the sensor, actuator, and processor 
signals are boolean, i.e., assume only two values. Boolean sensors include mechanical
and thermal limit switches, proximity switches, thermostats, and pushbutton
switches for operator commands. Actuators that are often configured as boolean 
devices include heaters, motors, pumps, valves, solenoids, alarms, and indicator 
lamps. Boolean control processors, referred to as logic controllers, include indus- 
trial relay systems, general-purpose microprocessors, and commercial programmable 
logic controllers. 

In this book, we consider control systems in which the sensor, actuator, and 
processor signals assume real values, or at least digital representations of real values. 
Many control systems include both types of signals: the real-valued signals that we
will consider, and boolean signals, such as fault or limit alarms and manual override 
switches, that we will not consider.

In control systems that use digital computers as control processors, the signals
are sampled at regular intervals, which may differ for different signals. In some cases 
these intervals are short enough that the sampled signals are good approximations 
of the continuous signals, but in many cases the effects of this sampling must be 
considered in the design of the control system. In this book, we consider control 
systems in which all signals are continuous functions of time. 

In the next few subsections we briefly describe some of the important tasks that 
make up control engineering. 

1.1.1 System Design and Control Configuration 

Control configuration is the selection and placement of the actuators and sensors on 
the system to be controlled, and is an aspect of system design that is very important 
to the control engineer. Ideally, a control engineer should be involved in the design of 
the system itself, even before the control configuration. Usually, however, this is not 
the case: the control engineer is provided with an already designed system and starts 
with the control configuration. Many aircraft, for example, are designed to operate 
without a control system; the control system is intended to improve the performance 
(indeed, such control systems are sometimes called stability augmentation systems, 
emphasizing the secondary role of the control system). 

Actuator Selection and Placement 

The control engineer must decide the type and placement of the actuators. In 
an industrial process system, for example, the engineer must decide where to put 
actuators such as pumps, heaters, and valves. The specific actuator hardware (or 
at least, its relevant characteristics) must also be chosen. Relevant characteristics 
include cost, power limit or authority, speed of response, and accuracy of response. 
One such choice might be between a crude, powerful pump that is slow to respond, 
and a more accurate but less powerful pump that is faster to respond. 

Sensor Selection and Placement 

The control engineer must also decide which signals in the system will be measured 
or sensed, and with what sensor hardware. In an industrial process, for example, 
the control engineer might decide which temperatures, flow rates, pressures, and 
concentrations to sense. For a mechanical system, it may be possible to choose 
where a sensor should be placed, e.g., where an accelerometer is to be positioned on 
an aircraft, or where a strain gauge is placed along a beam. The control engineer 
may decide the particular type or relevant characteristics of the sensors to be used, 
including the type of transducer, and the signal conditioning and data acquisition 
hardware. For example, to measure the angle of a shaft, sensor choices include 
a potentiometer, a rotary variable differential transformer, or an 8-bit or 12-bit
absolute or differential shaft encoder. In many cases, sensors are smaller than
actuators, so a change of sensor hardware is a less dramatic revision of the system 
design than a change of actuator hardware. 

There is not yet a well-developed theory of actuator and sensor selection and 
placement, possibly because it is difficult to precisely formulate the problems, and 
possibly because the problems are so dependent on available technology. Engineers 
use experience, simulation, and trial and error to guide actuator and sensor selection 
and placement. 

1.1.2 Modeling 

The engineer develops mathematical models of 

• the system to be controlled, 

• noises or disturbances that may act on the system, 

• the commands the operator may issue, 

• desirable or required qualities of the final system. 

These models might be deterministic (e.g., ordinary differential equations (ODE's), 
partial differential equations (PDE's), or transfer functions), or stochastic or prob- 
abilistic (e.g., power spectral densities). 

Models are developed in several ways. Physical modeling consists of applying 
various laws of physics (e.g., Newton's equations, energy conservation, or flow bal- 
ance) to derive ODE or PDE models. Empirical modeling or identification consists 
of developing models from observed or collected data. The a priori assumptions used 
in empirical modeling can vary from weak to strong: in a "black box" approach, 
only a few basic assumptions are made, for example, linearity and time-invariance 
of the system, whereas in a physical model identification approach, a physical model
structure is assumed, and the observed or collected data is used to determine good
values for the parameters of that structure. Mathematical models of a system are often built up
from models of subsystems, which may have been developed using different types 
of modeling. 

Often, several models are developed, varying in complexity and fidelity. A simple 
model might capture some of the basic features and characteristics of the system, 
noises, or commands; a simple model can simplify the design, simulation, or anal- 
ysis of the control system, at the risk of inaccuracy. A complex model could be 
very detailed and describe the system accurately, but a complex model can greatly 
complicate the design, simulation, or analysis of the system.

1.1.3 Controller Design

Controller design is the topic of this book. The controller or control law describes 
the algorithm or signal processing used by the control processor to generate the 
actuator signals from the sensor and command signals it receives. 

Controllers vary widely in complexity and effectiveness. Simple controllers in- 
clude the proportional (P), the proportional plus derivative (PD), the proportional 
plus integral (PI), and the proportional plus integral plus derivative (PID) con- 
trollers, which are widely and effectively used in many industries. More sophisti- 
cated controllers include the linear quadratic regulator (LQR), the estimated-state- 
feedback controller, and the linear quadratic Gaussian (LQG) controller. These 
sophisticated controllers were first used in state-of-the-art aerospace systems, but 
are only recently being introduced in significant numbers. 
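
As a rough illustration of how simple such a controller can be, the following sketch implements a discrete-time PID law in Python and applies it to a hypothetical first-order system dy/dt = -y + u. The class name PID, the gains, the sample time, and the plant are all assumptions made for this example and do not come from the text; setting ki and kd to zero gives a P controller, and setting only kd to zero gives a PI controller.

```python
from dataclasses import dataclass


@dataclass
class PID:
    """Discrete-time PID law in parallel form (illustrative, no filtering)."""
    kp: float
    ki: float
    kd: float
    dt: float                 # sample time in seconds (assumed value)
    integral: float = 0.0     # running integral of the error
    prev_error: float = 0.0   # error at the previous sample

    def update(self, command: float, measurement: float) -> float:
        """Return the actuator signal for one sample."""
        error = command - measurement
        self.integral += error * self.dt
        derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def simulate(controller: PID, steps: int, command: float = 1.0) -> float:
    """Forward-Euler simulation of the hypothetical plant dy/dt = -y + u.

    Returns the plant output after the given number of samples.
    """
    y = 0.0
    for _ in range(steps):
        u = controller.update(command, y)
        y += controller.dt * (-y + u)
    return y


# Same proportional gain, with and without integral action (30 s of simulated time).
p_final = simulate(PID(kp=2.0, ki=0.0, kd=0.0, dt=0.01), steps=3000)
pi_final = simulate(PID(kp=2.0, ki=1.0, kd=0.0, dt=0.01), steps=3000)
```

With these assumed values the P controller settles near 2/3 of the command, while the PI controller removes the steady-state error, which is one reason integral action is so widely used.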

Controllers are designed by many methods. Simple P or PI controllers have only 
a few parameters to specify, and these parameters might be adjusted empirically, 
while the control system is operating, using "tuning rules". A controller design 
method developed in the 1930's through the 1950's, often called classical controller 
design, is based on the 1930's work on the design of vacuum tube feedback am- 
plifiers. With these heuristic (but very often successful) techniques, the designer 
attempts to synthesize a compensation network or controller with which the closed- 
loop system performs well (the terms "synthesize", "compensation", and "network" 
were borrowed from amplifier circuit design). 

From the 1960's through the present time, state-space or "modern" controller design
methods have been developed. These methods are based on the fact that the
solutions to some optimal control problems can be expressed in the form of a feedback
law or controller, and on the development of efficient computer methods to solve
these optimal control problems.

Over the same time period, researchers and control engineers have developed 
methods of controller design that are based on extensive computing, for example, 
numerical optimization. This book is about one such method. 



1.1.4 Controller Implementation 

The signal processing algorithm specified by the controller is implemented on the 
control processor. Commercially available control processors are generally restricted 
to logic control and specific types of control laws such as PID. Custom control pro- 
cessors built from general-purpose microprocessors or analog circuitry can imple- 
ment a very wide variety of control laws. General-purpose digital signal processing 
(DSP) chips are often used in control processors that implement complex control 
laws. Special-purpose chips designed specifically for control processors are also now 
available.

1.1.5 Control System Testing, Validation, and Tuning

Control system testing may involve: 

• extensive computer simulations with a complex, detailed mathematical model, 

• real-time simulation of the system with the actual control processor operating 
("hardware in the loop"), 

• real-time simulation of the control processor, connected to the actual system 
to be controlled, 

• field tests of the control system. 

Often the controller is modified after installation to optimize the actual perfor- 
mance, a process known as tuning. 

1.2 Goals of Controller Design 

A well designed control system will have desirable performance. Moreover, a well 
designed control system will be tolerant of imperfections in the model or changes 
that occur in the system. This important quality of a control system is called 
robustness. 

1.2.1 Performance Specifications 

Performance specifications describe how the closed-loop system should perform. 
Examples of performance specifications are: 

• Good regulation against disturbances. The disturbances or noises that act on 
the system should have little effect on some critical variables in the system. 
For example, an aircraft may be required to maintain a constant bearing 
despite wind gusts, or the variations in the demand on a power generation and 
distribution system must not cause excessive variation in the line frequency. 
The ability of a control system to attenuate the effects of disturbances on 
some system variables is called regulation. 

• Desirable responses to commands. Some variables in the system should re- 
spond in particular ways to command inputs. For example, a change in the 
commanded bearing in an aircraft control system should result in a change in 
the aircraft bearing that is sufficiently fast and smooth, yet does not exces- 
sively overshoot or oscillate. 

• Critical signals are not too big. Critical signals always include the actuator 
signals, and may include other signals in the system. In an industrial process
control system, for example, an actuator signal that goes to a pump must
remain within the limits of the pump, and a critical pressure in the system 
must remain below a safe limit. 

Many of these specifications involve the notion that a signal (or its effect) is small; 
this is the subject of chapters 4 and 5. 

1.2.2 Robustness Specifications 

Robustness specifications limit the change in performance of the closed-loop system 
that can be caused by changes in the system to be controlled or differences between 
the system to be controlled and its model. Such perturbations of the system to be 
controlled include: 

• The characteristics of the system to be controlled may change, perhaps due 
to component drift, aging, or temperature coefficients. For example, the efficiency
of a pump used in an industrial process control system may decrease,
over its lifetime, to 70% of its original value.

• The system to be controlled may have been inaccurately modeled or identified, 
possibly intentionally. For example, certain structural modes or nonlinearities 
may be ignored in an aircraft dynamics model. 

• Gross failures, such as a sensor or actuator failure, may occur.

Robustness specifications can take several forms, for example:

• Low differential sensitivities. The derivative of some closed-loop quantity, 
with respect to some system parameter, is small. For example, the response 
time of an aircraft bearing to a change in commanded bearing should not be 
very sensitive to aerodynamic pressure. 

• Guaranteed margins. The control system must have the ability to meet some 
performance specification despite some specific set of perturbations. For example,
we may require that the industrial process control system mentioned
above continue to have good regulation of product flow rate despite any decrease
in pump efficiency down to 70% of its original value.
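
The two forms of robustness specification above can be checked numerically on a toy model. The sketch below assumes a static loop in which the pump efficiency g (1.0 when new) multiplies a loop gain k, so that the closed-loop gain from commanded to actual flow rate is gk/(1 + gk). This model, the function names, and the 5% error limit are assumptions for illustration only.

```python
def closed_loop_gain(g: float, k: float) -> float:
    """Static closed-loop gain of the assumed loop: g*k / (1 + g*k)."""
    return g * k / (1.0 + g * k)


def differential_sensitivity(g: float, k: float, h: float = 1e-6) -> float:
    """Central-difference estimate of d(closed-loop gain)/dg at efficiency g."""
    return (closed_loop_gain(g + h, k) - closed_loop_gain(g - h, k)) / (2.0 * h)


def meets_margin(k: float, g_min: float = 0.7, g_max: float = 1.0,
                 max_error: float = 0.05, samples: int = 31) -> bool:
    """Check the regulation spec on a grid of efficiencies in [g_min, g_max].

    A grid only samples the perturbation set. Here the error 1/(1 + g*k)
    decreases monotonically in g, so the endpoint g_min is the worst case
    and the grid check is conclusive for this particular model.
    """
    for i in range(samples):
        g = g_min + (g_max - g_min) * i / (samples - 1)
        if abs(1.0 - closed_loop_gain(g, k)) > max_error:
            return False
    return True


# A loop gain of 20 meets the 5% limit with a new pump but not at 70% efficiency;
# a loop gain of 30 meets it over the whole range.
margin_ok_k20 = meets_margin(20.0)
margin_ok_k30 = meets_margin(30.0)
sensitivity_k30 = differential_sensitivity(1.0, 30.0)
```

In this toy model a higher loop gain both lowers the differential sensitivity and widens the guaranteed margin; for other models the two kinds of specification need not move together.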



1.2.3 Control Law Specifications 

In addition to the goals and specifications described above, there may be constraints 
on the control law itself. These control law specifications are often related to the 
implementation of the controller. Examples include: 

• The controller has a specific form, e.g., PID.

• The controller is linear and time-invariant (LTI).

• In a control system with many sensors and actuators, we may require that 
each actuator signal depend on only one sensor signal. Such a controller is 
called decentralized, and can be implemented using many noncommunicating 
control processors. 

• The controller must be implemented using a particular control processor. This 
specification limits the complexity of the controller. 
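
The decentralization constraint in the list above has a simple structural reading when the controller is reduced to a static gain matrix, which is a simplification assumed here for illustration (an LTI controller is in general a matrix of transfer functions). In the sketch below, gain[i][j] is the hypothetical gain from sensor j to actuator i, and the function name is our own.

```python
from typing import Sequence


def is_decentralized(gain: Sequence[Sequence[float]]) -> bool:
    """Return True if every actuator (row) depends on at most one sensor (column)."""
    return all(sum(1 for entry in row if entry != 0.0) <= 1 for row in gain)


# Two actuators, three sensors (all values are made up).
decentralized_gain = [[2.0, 0.0, 0.0],
                      [0.0, 0.0, -1.5]]
centralized_gain = [[2.0, 0.3, 0.0],   # actuator 0 uses sensors 0 and 1
                    [0.0, 0.0, -1.5]]
```

Each row of a decentralized gain can be computed by a separate processor that reads a single sensor, which is why such a controller needs no communication between processors.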

1.2.4 The Controller Design Problem 

Once the system to be controlled has been designed and modeled, and the designer 
has identified a set of design goals (consisting of performance goals, robustness re- 
quirements, and control law constraints), we can pose the controller design problem: 

The controller design problem: Given a model of the system to be 
controlled (including its sensors and actuators) and a set of design goals, 
find a suitable controller, or determine that none exists. 

Controller design, like all engineering design, involves tradeoffs; by suitable, we 
mean a satisfactory compromise among the design goals. Some of the tradeoffs in 
controller design are intuitively obvious: e.g., in mechanical systems, it takes larger 
actuator signals (forces, torques) to have faster responses to command signals. Many 
other tradeoffs are not so obvious. 

In our description of the controller design problem, we have emphasized the 
determination of whether or not there is any controller that provides a suitable 
tradeoff among the goals. This aspect of the controller design problem can be as 
important in control engineering as finding or synthesizing an appropriate controller 
when one exists. If it can be determined that no controller can achieve a suitable 
tradeoff, the designer must: 

• relax the design goals, or 

• redesign the system to be controlled, for example by adding or relocating 
sensors or actuators. 

In practice, existing controller design methods are often successful at finding a 
suitable controller, when one exists. These methods depend upon talent, experience, 
and a bit of luck on the part of the control engineer. If the control engineer is suc- 
cessful and finds a suitable controller, then of course the controller design problem 
has been solved. However, if the control engineer fails to design a suitable con- 
troller, then he or she cannot be sure that there is no suitable controller, although 
the control engineer might suspect this. Another design approach or method (or 
indeed, control engineer) could find a suitable controller.

1.3 Control Engineering and Technology

1.3.1 Some Advances in Technology 

Control engineering is driven by available technology, and the pace of the relevant 
technology advances is now rapid. In this section we mention a few of the advances 
in technology that currently have, or will have, an impact on control engineering. 
More specific details can be found in the Notes and References at the end of this 
chapter. 



Integrated and Intelligent Sensors 

Over the past decade the technology of integrated sensors has been developed. Inte- 
grated sensors are built using the techniques of microfabrication originally developed 
for integrated circuits; they often include the signal conditioning and interface cir- 
cuitry on the same chip, in which case they are called intelligent sensors. This 
signal conditioning might include, for example, temperature compensation. In- 
tegrated sensors promise greater reliability and linearity than many conventional 
sensors, and because they are typically cheaper and smaller than conventional sen- 
sors, it will be possible to incorporate many more sensors in the design of control 
systems than is currently done. 

Another example of a new sensor technology is the Global Positioning System 
(GPS). GPS position and velocity sensors will soon be available for use in control 
systems. 



Actuator Technology 

Significant improvements in actuator technology have been made. For example, 
direct-drive brushless DC motors are more linear and have higher bandwidths than 
the motors with brushes and gears (and stiction and backlash) that they will replace. 
As another example, the trend in aircraft design is towards many actuators, such 
as canards and vectored thrust propulsion systems. 



Digital Control Processors 

Over the last few decades, the increase in control processor power and simultane- 
ous decrease in cost has been phenomenal, especially for digital processors such as 
general-purpose microprocessors, digital signal processors, and special-purpose con- 
trol processors. As a result, the complexity of control laws that can be implemented 
has increased dramatically. In the future, custom or semicustom chips designed 
specifically for control processor applications will offer even more processing power. 




Computer-Aided Control System Design and Analysis 

Over the past decade we have seen the rise of computer-aided control system de- 
sign (CACSD). Great advances in available computing power (e.g., the engineering 
workstation), together with powerful software, have automated or eased many of 
the tasks of control engineering: 

• Modeling. Sophisticated programs can generate finite element models or de- 
termine the kinematics and dynamics of a mechanical system from its physical 
description. Software that implements complex identification algorithms can 
process large amounts of experimental data to form models. Interactive and 
graphics driven software can be used to manipulate models and build models 
of large systems from models of subsystems. 

• Simulation. Complex models can be rapidly simulated. 

• Controller design. Enormous computing power is now available for the design 
of controllers. This last observation is of fundamental importance for this 
book. 

1.3.2 Challenges for Controller Design 

The technology advances described above present a number of challenges for con- 
troller design: 

• More sensors and actuators. For only a modest cost, it is possible to incor- 
porate many more sensors, and possibly more actuators, into the design of a 
system. Clearly the extra information coming from the sensors and the extra 
degrees of freedom in manipulating the system make better control system 
performance possible. The challenge for controller design is to take advantage 
of this extra information and degrees of freedom. 

• Higher quality systems. As higher quality sensors and actuators are incorpo- 
rated into the system, the system behavior becomes more repeatable and can 
be more accurately modeled. The challenge for controller design is to take 
advantage of this more detailed knowledge of the system. 

• More powerful control processors. Very complex control laws can be imple- 
mented using digital control processors. Clearly a more complex control law 
could improve control system performance (it could also degrade system per- 
formance, if improperly designed). The challenge for controller design is to 
fully utilize the control processor power to achieve better control system per- 
formance. 

In particular, control law specifications should be examined carefully. Historically
relevant measures of control law complexity, such as the order of an LTI
controller, are now less relevant. For example, the order of the compensator
used in a vacuum tube feedback amplifier is the number of inductors and capacitors
needed to synthesize the compensation network, and is therefore
related to cost, size, and reliability. On a particular digital control processor,
however, the order of the controller is essentially unrelated to cost, size, and
reliability.

• Powerful computers to design controllers. The challenge for controller design 
is to productively use the enormous computing power available. Many current 
methods of computer-aided controller design simply automate procedures de- 
veloped in the 1930's through the 1950's, for example, plotting root loci or 
Bode plots. Even the "modern" state-space and frequency-domain methods 
(which require the solution of algebraic Riccati equations) greatly underutilize 
available computing power. 

1.4 Purpose of this Book 

The main purpose of this book is to describe how the controller design problem can 
be solved for a restricted set of systems and a restricted set of design specifications, 
by combining recent theoretical results with recently developed numerical convex 
optimization techniques. 

The restriction on the systems is that they must be linear and time-invariant 
(LTI). The restriction on the design specifications is that they be closed-loop convex, 
a term we shall describe in detail in chapter 6. This restricted set of design specifi- 
cations includes a wide class of performance specifications, a less complete class of 
robustness specifications, and essentially none of the control law specifications. 

The basic approach involves directly designing a good closed-loop response, as 
opposed to designing an open-loop controller that yields a good closed-loop response. 
We will show that a wide variety of important practical constraints on system 
performance can be formulated as convex constraints on the response of the closed- 
loop system. These are the specifications that we call closed-loop convex. 

Given a system that is LTI, and a set of closed-loop convex design specifica- 
tions, the controller design problem can be cast as a convex optimization problem, 
and consequently, can be effectively solved. This means that if the specifications are 
achievable, we can find a controller that meets the specifications; if the specifications 
are not achievable, this fact can be determined, i.e., we will know that the spec- 
ifications are not achievable. In contrast, the designer using a classical controller 
design scheme is only likely to find a controller that meets a set of specifications 
that is achievable; and, of course, certain not to find a controller that meets a set of 
specifications that is not achievable. Many controller design techniques do not have 
any way to determine unambiguously that a set of specifications is not achievable. 

For controller design problems of the restricted form, we shall show how to
determine which specifications can be achieved and which cannot, and therefore
how the limits of performance can be determined for a given system and control 
configuration. 

No matter which controller design method is used by the engineer, knowledge 
of the achievable performance is extremely valuable practical information, since it 
provides an absolute yardstick against which any designed controller can be com- 
pared. To know that a certain candidate controller that is easily implemented, or 
has some other advantage, achieves regulation only 10% worse than the best reg- 
ulation achievable by any LTI controller, is a strong point in favor of the design. 
In this sense, this book is not about a particular controller design method or syn- 
thesis procedure; rather it is about a method of determining what specifications (of 
a large but restricted class) can be met using any controller design method, for a 
given system and control configuration. 

We have in addition several subsidiary goals, some of which we have already 
mentioned. The first is to develop a framework in which we can precisely formulate 
the controller design problem which we vaguely described above. Our experience 
suggests that carefully formulating a real controller design problem in the frame- 
work we develop will help identify the critical issues and design tradeoffs. This 
clarification is useful in practical controller design. 

We also hope to initiate a discussion of how we can apply the enormous com- 
puting power that will soon be available to the controller design problem, beyond, 
for example, solving the algebraic Riccati equations of "modern" controller design 
methods. In this book we start this discussion with a specific suggestion: solving 
convex nondifferentiable optimization problems. 

1.4.1 An Example 

We can demonstrate some of the main points of this book with an example. We will 
consider a specific system that has one actuator and one output that is supposed 
to track a command input, and is affected by some noises; the system is described 
in section 2.4 (and many other places throughout the book), but the details are not 
relevant for this example. 

Goals for the design of a controller for this system might be: 

• Good RMS regulation, i.e., the root-mean-square (RMS) value of the output, 
due to the noises, should be small. 

• Low RMS actuator effort, i.e., the RMS value of the actuator signal should 
be small. 

It is intuitively clear that by using a larger actuator signal, we may improve the 
regulation, since we can expend more effort counteracting the effect of the noises. 
We will see in chapter 12 that the exact nature of this tradeoff between RMS 
regulation and RMS actuator effort can be determined; it is shown in figure 1.2. 
The shaded region shows every pair of RMS regulation and RMS actuator effort 
specifications that can be achieved by a controller; the designer must, of course, 
pick one of these. 

Figure 1.2 The shaded region shows specifications on RMS actuator effort 
and RMS regulation that are achievable. The unshaded region, at the lower 
left, shows specifications that no controller can achieve: this region shows a 
fundamental limit of performance for this system. 



The unshaded region at the lower left is very important: it consists of RMS 
regulation and RMS actuator effort specifications that cannot be achieved by any 
controller, no matter which design method is used. This unshaded region therefore 
describes a fundamental limit of performance for this system. It tells us, for exam- 
ple, that if we require an RMS regulation of 0.05, then we cannot simultaneously 
achieve an RMS actuator effort of 0.05. 

Each shaded point in figure 1.2 represents a possible design; we can view many 
controller design methods as "rummaging around in the shaded region". If the 
designer knows that a point is shaded, then the designer can find a controller that 
achieves the corresponding specifications, if the designer is clever enough. On the 
other hand, each unshaded point represents a limit of performance for our system. 
Knowing that a point is unshaded is perhaps disappointing, but still very useful 
information for the designer. 
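
To make the idea of a tradeoff curve concrete, the following Python sketch traces one for a deliberately simple, hypothetical problem; it is not the plant of section 2.4. It assumes a first-order plant dx/dt = a x + b u + w driven by white noise w of intensity W, with the state x measured exactly and a constant-gain control law u = -k x. The function names `rms_pair`, `optimal_gain`, and `tradeoff_curve`, and the values a = -1, b = 1, W = 1, are assumptions made only for illustration. For this toy problem the gain that minimizes E[x^2] + rho E[u^2] is available in closed form, and sweeping the weight rho moves along the boundary of the pairs achievable with such a control law.

```python
import math


def rms_pair(k, a=-1.0, b=1.0, W=1.0):
    """RMS regulation and RMS actuator effort for u = -k*x on dx/dt = a*x + b*u + w."""
    pole = a - b * k                  # closed-loop pole; must be negative
    if pole >= 0:
        raise ValueError("closed loop is unstable for this gain")
    var_x = W / (2.0 * -pole)         # stationary variance of x
    return math.sqrt(var_x), abs(k) * math.sqrt(var_x)


def optimal_gain(rho, a=-1.0, b=1.0):
    """Gain minimizing E[x^2] + rho*E[u^2] (scalar Riccati equation, closed form)."""
    return (a + math.sqrt(a * a + b * b / rho)) / b


def tradeoff_curve(rhos):
    """One (RMS regulation, RMS actuator effort) point per weight rho."""
    return [rms_pair(optimal_gain(rho)) for rho in rhos]


if __name__ == "__main__":
    for rho in (100.0, 10.0, 1.0, 0.1, 0.01):
        reg, eff = rms_pair(optimal_gain(rho))
        print(f"rho={rho:7.2f}  RMS regulation={reg:.4f}  RMS actuator effort={eff:.4f}")
```

Small values of rho buy better regulation at the price of more actuator effort, and large values do the opposite, which is the qualitative shape of the boundary in figure 1.2.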

The reader may know that this tradeoff of RMS regulation against RMS actuator 
effort can be determined using LQG theory. The main point of this book is that 
for a much wider class of specifications, a similar tradeoff curve can be computed. 
Suppose, for example, that we add the following specification to our goals above: 

• Command to output overshoot limit, i.e., the step response overshoot of the 
closed-loop system, from the command to the output, does not exceed 10%. 

Of course, intuition tells us that by adding this specification, we make the design 
problem "harder": certain RMS regulation and RMS actuator effort specifications 
that could be achieved without this new specification will no longer be achievable 
once we impose it. 

In this case there is no analytical theory, such as LQG, that shows us the exact 
tradeoff. The methods of this book, however, can be used to determine the exact 
tradeoff of RMS regulation versus RMS actuator effort with the overshoot limit 
imposed. This tradeoff is shown in figure 1.3. The dashed line, below the shaded 
region of achievable specifications, is the tradeoff boundary when the overshoot limit 
is not imposed. The "lost ground" represents the cost of imposing the overshoot 
limit. We can compute this new region because limits on RMS actuator effort, RMS 
regulation, and step response overshoot are all closed-loop convex specifications. 

Figure 1.3 The shaded region shows specifications on RMS actuator effort 
and RMS regulation that are achievable when an additional limit of 10% 
step response overshoot is imposed; it can be computed using the methods 
described in this book. The dashed line shows the tradeoff boundary without 
the overshoot limit; the gap between this line and the shaded region shows 
the cost of imposing the overshoot limit. 



In contrast, suppose that instead of the overshoot limit, we impose the following 
control law constraint: 

• The controller is proportional plus derivative (PD), i.e., the control law has a 
specific form. 

This constraint might be needed to implement the controller using a specific commercially 
available control processor. This specification is not closed-loop convex, so 
the methods described in this book cannot be used to determine the exact tradeoff 
between RMS actuator effort and RMS regulation. This tradeoff can be computed, 
however, using a brute force approach described in the Notes and References, and 
is shown in figure 1.4. The dashed line is the tradeoff boundary when the PD con- 
troller constraint is not imposed. Specifications on RMS actuator effort and RMS 
regulation that lie in the region between the dashed line and the shaded region can 
be achieved by some controller, but no PD controller. 

Figure 1.4 The shaded region shows specifications on RMS actuator effort 
and RMS regulation that can be achieved using a PD controller; it cannot be 
computed using the methods described in this book. The dashed line shows 
the tradeoff boundary when no constraint on the control law is imposed. 



An important point of this book is that we can compute tradeoffs among closed- 
loop convex specifications, such as shown in figure 1.3, although it requires more 
computation than determining the tradeoff for a problem that has an analytical 
solution, such as shown in figure 1.2; in return, however, a much larger class of 
problems can be considered. While the computation needed to determine a tradeoff 
such as shown in figure 1.3 is more than that required to compute the tradeoff shown 
in figure 1.2, it is much less than the computation required to compute tradeoffs 
such as the one shown in figure 1.4. 

The fact that a tradeoff like the one shown in figure 1.4 is much harder to 
compute than a tradeoff like the one shown in figure 1.3 presents a paradox. To 
produce figure 1.3 we search over the set of all possible LTI controllers, which has 
infinite dimension. To produce figure 1.4, however, we search over the set of all PD 
controllers, which has dimension two. We shall see that convexity makes figure 1.3 
"easier" to produce than figure 1.4, even though we must search over a far "larger" 
set of potential controllers. 

1.5 Book Outline 

In part I, A Framework for Controller Design, we develop a formal framework for 
many of the concepts described above: the system to be controlled, the control con- 
figuration, the controller, and the design goals and objectives for controller design. 

In part II, Analytical Tools, we first describe norms of signals and systems, which 
can be used to make precise such design goals as "error signals should be made small, 
while the actuator signals should not be too large". We then study some important 
geometric properties that many controller design specifications have, and introduce 
the important notion of a closed-loop convex design specification. 

In part III, Design Specifications, we catalog many closed-loop convex design 
specifications. These design specifications include specifications on the response of 
the closed-loop system to the various commands and disturbances that may act on 
it, as well as robustness specifications that limit the sensitivity of the closed-loop 
system to changes in the system to be controlled. 

In part IV, Numerical Methods, we describe numerical methods for solving the 
controller design problem. We start with some controller design problems that have 
analytic solutions, i.e., can be solved rapidly and exactly using standard methods. 
We then turn to the numerical solution of controller design problems that can be 
expressed in terms of closed-loop convex design specifications, but do not have 
analytic solutions. 

In the final chapter we give some discussion of the methods described in this 
book, as well as some history of the main ideas. 

1.5.1 Book Structure 

The structure of this book is shown in detail in figure 1.5. From this figure the 
reader can see that the structure of this book is more vertical than that of most 
books on linear controller design, which often have parallel discussions of different 
design techniques. In contrast, this book tells essentially one story, with a few 
chapters covering related subplots. 

A minimal path through the book, which conveys only the essentials of the story, 
consists of chapters 2, 3, 6, 8-10, and 15. This path results from following every 
dashed line labeled "experts can skip" in figure 1.5. We note, however, that the 
term "expert" depends on the context: for example, the reader may be an expert 
on norms (and thus can safely skip or skim chapters 4 and 5), but not on convex 
optimization (and thus should read chapters 13 and 14). 

Notes and References 

A history of feedback control is given in Mayr [May70] and the book [Ben79] and arti- 
cle [Ben76] by Bennett. 

Sensors and Actuators 

Commercially available sensors and actuators for control systems are surveyed in the books 
by Hordeski [Hor87] and DeSilva [DeS89]; the reader can also consult commercial catalogs 
and manuals such as [ECC80] and [Tra89b]. 

The technology behind integrated sensors and actuators is discussed in the survey article by 
Petersen [Pet82]. Commercial implications of integrated sensor technology are discussed 
in, e.g., [All80] (many of the predictions in this article have come to pass over the last 
decade). Research developments in integrated sensors and actuators can be found in 
the conference proceedings [Tra89a] (this conference occurs every other year), and the 
journal Sensors and Actuators, published by Elsevier Sequoia. The journal IEEE Trans. 
on Electron Devices occasionally has special issues on integrated sensors and actuators 
(e.g., Dec. 1979, Jan. 1982). 

Overviews of GPS can be found in the book compiled by Wells [Wel87] and the two 
volume set of reprints published by the Institute of Navigation [GPS84]. 

Modeling and Identification 

Formulation of dynamics equations for physical modeling of mechanical systems is covered 
in Kane and Levinson [KL85], Crandall et al. [CKK68], and Cannon [Can67]. Texts 
treating identification include those by Box and Jenkins [BJ70], Norton [Nor86], and 
Ljung [Lju87], which has a complete bibliography. 

Linear Controller Design 

P and PI controllers have been in use for a long time; for example, the advantage of a 
PI controller over a P controller is discussed in Maxwell's 1868 article [Max68], which is 
one of the first articles on controller design and analysis. PID tuning rules that have been 
widely used originally appeared in the 1942 article by Ziegler and Nichols [ZN42]. 

The book Theory of Servomechanisms [JNP47], edited by James, Nichols, and Phillips, 
gives a survey of controller design right after World War II. The 1957 book by Newton, 
Gould, and Kaiser [NGK57] is among the first to adopt an "analytical" approach to 
controller design (see below). Texts covering classical linear controller design include 
Bode [Bod45], Ogata [Oga90], Horowitz [Hor63], and Dorf [Dor88]. The root locus 
method was first described in [Eva50]. 

Recent books covering classical and state-space methods of linear controller design include 
Franklin, Powell, and Emami [FPE86] and Chen [Che87]. Linear quadratic methods 
for LTI controller design are covered in Athans and Falb [AF66, ch.9], Kwakernaak and 
Sivan [KS72], Anderson and Moore [AM90], and Bryson and Ho [BH75]. 

Three recent books on LTI controller design deserve special mention: Lunze's Robust Mul- 
tivariable Feedback Design [Lun89], Maciejowski's Multivariable Feedback Design [Mac89], 
and Vidyasagar's Control System Synthesis: A Factorization Approach [Vid85]. The first 
two, [Lun89] and [Mac89], cover a broad range of current topics and linear controller 
design techniques, although neither covers our central topic, convex closed-loop design. 
Compared to this book, these two books address more directly the question of how to 
design linear controllers. Vidyasagar's book [Vid85] contains the "recent results" that we 
referred to at the beginning of section 1.4. Our book can be thought of as an extension or 
application of the ideas in [Vid85]. 

Digital Control 

Digital control systems are covered in the books by Ogata [Oga87], Ackermann [Ack85], 
and Åström and Wittenmark [AW90]. A recent comprehensive text covering all aspects 
of digital control systems is by Franklin, Powell, and Workman [FPW90]. 

Control Processors and Controller Implementation 

Programmable logic controllers and other industrial control processors are covered in 
Warnock [War88]. An example of a commercially available special-purpose chip for control 
systems is National Semiconductor's LM628 precision motion controller [Pre89], which 
implements a PID control law. 

The use of DSP chips as control processors is discussed in several articles and manufac- 
turers' applications manuals. For example, the implementation of a simple controller on 
a Texas Instruments TMS32010 is described in [SB87], and the implementation of a PID 
controller on a Motorola DSP56001 is described in [SS89]. Chapter 12 of [FPW90] de- 
scribes the implementation of a complex disk drive head positioning controller using the 
Analog Devices ADSP2101. The article [Che82] describes the implementation of simple 
controllers on an Intel 2920. 

The design of custom integrated circuits for control processors is discussed in [JTP85] 
and [TL80]. 

General issues in controller implementation are discussed in the survey paper by Hansel- 
mann [Han87]. 

The book [AT90] discusses real-time software used to program general-purpose computers 
as control processors. Topics covered include implementing the control law, interface to 
actuators and sensors, communication, data logging, and operator display. 

Computers and Control Engineering 

Examples of computer-based equipment for control systems engineering include Hewlett- 
Packard's hp3563a Control Systems Analyzer [HeP89], which automates frequency re- 
sponse measurements and some simple identification procedures, and Integrated Systems' 
AC-100 control processor [AC100], which allows rapid implementation of a controller for 
prototyping. 

A review of various software packages for structural analysis is given in [Nik86], in particu- 
lar the chapter [Mac86]. A widely used finite element code is nastran [Cif89]. Computer 
software packages (based on Kane's method) that symbolically form the system dynamics 
include Sd/exact, described in [RS86, SR88] and AUTOLEV, described in [SL88]. Exam- 
ples of software for system identification are the system-id toolbox [Lju86] for use with 
matlab and the system-id package [mat88] for use with matrix-x (see below). 

Examples of controller design software are matlab [MLB87] (and [LL87]), matrix- 
x [SFL85, WSG84], delight-mimo [PSW85], and console [FWK89]. Some of these 
programs were originally based on the linear algebra software packages linpack [DMB79] 
and eispack [SBD76]. A new generation of reliable linear algebra routines is now being 
developed in the lapack project [Dem89]; lapack will take advantage of some of the 
advances in computer hardware, e.g., vector processing. 

General discussion of CACSD can be found, for example, in the article [Ast83], and the 
special issue of the Proceedings of the IEEE [PIE84]. See also [JH85] and [Den84]. 

Determining Limits of Performance 

The value of being able to determine that a set of specifications cannot be achieved, and 
the failing of many controller design methods in this regard, has been noted before. In 
the 1957 book Analytical Design of Linear Feedback Controls, by Newton, Gould, and 
Kaiser [NGK57, §1.6], we find: 

Unfortunately, the trial and error design method is beset with certain fun- 
damental difficulties, which must be clearly understood and appreciated in 
order to employ it properly. From both a practical and theoretical viewpoint 
its principal disadvantage is that it cannot recognize an inconsistent set of 
specifications. 

. . . The analytical design procedure has several advantages over the trial and 
error method, the most important of which is the facility to detect immediately 
and surely an inconsistent set of specifications. The designer obtains a "yes" 
or "no" answer to the question of whether it is possible to fulfill any given 
set of specifications; he is not left with the haunting thought that if he had 
tried this or that form of compensation he might have been able to meet the 
specifications. 

. . . Even if the reader never employs the analytical procedure directly, the 
insight that it gives him into linear system design materially assists him in 
employing the trial and error design procedure. 

This book is about an analytical design procedure, in the sense in which the phrase is used 
in this quote. 

There are a few results in classical controller design that can be used to determine 
some specifications that cannot be achieved. The most famous is Bode's integral theo- 
rem [Bod45]; a more recent result is due to Zames [Zam81, ZF83]. These results were 
extended to unstable plants by Freudenberg and Looze [FL85, FL88], and plants with 
multiple sensors and actuators by Boyd and Desoer [BD85]. 

The article by Barratt and Boyd [BB89] gives some specific examples of using convex 
optimization to numerically determine the limits of performance of a simple control system. 
The article by Boyd, Barratt, and Norman [BBN90] gives an overview of the closed-loop 
convex design method. 

About the Example in Section 1.4.1 

The plant used is described in section 2.4; the process and sensor noises are described in 
chapter 11, and the precise definitions of RMS actuator effort, RMS regulation, and step 
response overshoot are given in chapters 3, 5, and 8. 

The method used to determine the shaded region in figure 1.2 is explained in section 12.2.1; 
a similar figure appears in Kwakernaak and Sivan [KS72, p205]. The method used to 
determine the shaded region in figure 1.3 is explained in detail in chapter 15. 

The exact form of the PD controller was 

[Equation (1.1): the transfer function of the PD controller; the formula is not legible in this copy.] 

where $k_p$ and $k_d$ are constants, the proportional and derivative gains, respectively. 

Determining the shaded region shown in figure 1.4 required the solution of many global 
optimization problems in the variables $k_p$ and $k_d$. We first used a numerical local 
optimization method designed especially for parametrized controller design problems with 
RMS specifications; see, e.g., the survey by Mäkilä and Toivonen [MT87]. This produced 
a region that was likely, but not certain, to be the whole region of achievable specifications. 
To verify that we had found the whole region, we used the Routh conditions to determine 
analytically the region in the $k_p$, $k_d$ plane that corresponds to stable closed-loop systems; 
this region was very finely gridded and the RMS actuator effort and regulation checked 
over this grid. This exhaustive search revealed that for this example, the local optimiza- 
tion method had indeed found the global minima; it simply took an enormous amount of 
computation to verify that the solutions were global. Of course, in general, local methods 
can miss the global minimum. (See the discussion in section 14.6.4.) 
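
The exhaustive check described above can be imitated on a toy problem. The Python sketch below assumes a hypothetical plant x'' + x' = u + w (white noise w of intensity W, with position and velocity both measured), not the plant of section 2.4, and an ideal PD law u = -(k_p x + k_d x'). The closed loop is then x'' + (1 + k_d) x' + k_p x = w, for which the Routh conditions reduce to both coefficients being positive, and the stationary variances follow from the Lyapunov equation in closed form. The function names, grid bounds, and grid size are illustrative choices.

```python
import math


def pd_rms(kp, kd, W=1.0):
    """RMS regulation and actuator effort for u = -(kp*x + kd*x'), or None if unstable."""
    c0, c1 = kp, 1.0 + kd                  # closed loop: x'' + c1*x' + c0*x = w
    if c0 <= 0.0 or c1 <= 0.0:             # Routh conditions for a second-order polynomial
        return None
    var_x = W / (2.0 * c0 * c1)            # stationary variance of x
    var_v = W / (2.0 * c1)                 # stationary variance of x'
    var_u = kp * kp * var_x + kd * kd * var_v   # x and x' are uncorrelated in steady state
    return math.sqrt(var_x), math.sqrt(var_u)


def achievable_by_grid(max_reg, max_eff, n=200, kp_max=20.0, kd_max=20.0):
    """Brute-force test: does some gridded PD gain meet both RMS specifications?"""
    for i in range(1, n + 1):
        for j in range(n + 1):
            kp = kp_max * i / n
            kd = -1.0 + (kd_max + 1.0) * j / n
            point = pd_rms(kp, kd)
            if point is not None and point[0] <= max_reg and point[1] <= max_eff:
                return True
    return False


if __name__ == "__main__":
    print(achievable_by_grid(1.0, 1.0))
    print(achievable_by_grid(0.01, 0.01))
```

With n = 200 the search evaluates about 40,000 gain pairs; the same resolution in five parameters would need 200^5, roughly 3.2 x 10^11, evaluations. A negative answer from such a search holds only up to the grid bounds and resolution chosen.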

A more sophisticated global optimization algorithm, such as branch-and-bound, could 
have been used (see, e.g., Pardalos and Rosen [PR87]). But all known global optimization 
algorithms involve computation that in the worst case grows exponentially with the number 
of variables. A similar five parameter global optimization problem would probably be 
computationally intractable. 

Chapter 2 

A Framework for Control 
System Architecture 



In this chapter we describe a formal framework for what we described in chapter 1 
as the system to be controlled, the control configuration, and the control law or 
controller. 



2.1 Terminology and Definitions 

We start with a mathematical model of the system to be controlled that includes 
the sensors and actuators. We refer to the independent variables in this model as 
the input signals or inputs, and the dependent variables as the output signals or 
outputs. In this section we describe an important further division of these signals 
into those the controller can access and those it cannot. 

The inputs to the model include the actuator signals, which come from the 
controller, and other signals that represent noises and disturbances acting on the 
system. We will see in chapter 10 that it may be advantageous to include among 
these input signals some fictitious inputs. These fictitious inputs are not used to 
model any specific noise or disturbance; they allow us to ask the question, "what if 
a signal were injected here?". 

Definition 2.1: The inputs to the model are divided into two vector signals: 

• The actuator signal vector, denoted u, consists of the inputs to the model that 
can be manipulated by the controller. The actuator signal vector u is exactly 
the signal vector generated by the control processor. 

• The exogenous input vector, denoted w, consists of all other input signals to 
the model. 

The number of actuator and exogenous input signals, i.e., the sizes of u and w, 
will be denoted $n_u$ and $n_w$, respectively. 

Our model of the system must provide as output every signal that we care about, 
i.e., every signal needed to determine whether a proposed controller is an acceptable 
design. These signals include the signals we are trying to regulate or control, all 
actuator signals (u), all sensor signals, and perhaps important internal variables, 
for example stresses on various parts of a mechanical system. 

Definition 2.2: The outputs of the model consist of two vector signals: 

• The sensor signal vector, denoted y, consists of output signals that are acces- 
sible to the controller. The sensor signal y is exactly the input signal vector 
to the control processor. 

• The regulated outputs signal vector, denoted z, consists of every output signal 
from the model. 

The number of sensor and regulated output signals, i.e., the sizes of y and z, 
will be denoted $n_y$ and $n_z$, respectively. 

We refer to the model of the system, with the two vector input signals w and u 
and the two vector output signals z and y, as the plant, shown in figure 2.1. Even 
though z includes the sensor signal y, we draw them as two separate signal vectors. 



[Block diagram: the plant, with inputs w (exogenous inputs) and u (actuator inputs), and outputs z (regulated outputs) and y (sensed outputs).] 

Figure 2.1 The plant inputs are partitioned into signals manipulable by 
the controller (u) and signals not manipulable by the controller (w). Among 
all of the plant outputs (z) are the outputs that the controller has access to (y). 

When the control system is operating, the controller processes the sensor signal y 
to produce the actuator signal u, as shown in figure 2.2. We refer to this connection 
of the plant and controller as the closed-loop system. The closed-loop system has 
input w and output z. 
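
As a minimal illustration of how closing the loop leaves a system with input w and output z, the Python sketch below assumes, purely for simplicity, that the plant and the controller are memoryless scalar gains. The coefficient names `p_zw`, `p_zu`, `p_yw`, `p_yu` and the controller gain `k` are hypothetical labels introduced here; they stand for the four input-to-output paths of the plant and for the relation u = k y. Eliminating u and y from the loop equations gives a single gain from w to z.

```python
def closed_loop_gain(p_zw, p_zu, p_yw, p_yu, k):
    """Gain from w to z after closing the loop u = k*y around a static plant.

    Plant (all names hypothetical):  z = p_zw*w + p_zu*u,   y = p_yw*w + p_yu*u.
    """
    loop = 1.0 - p_yu * k
    if loop == 0:
        raise ZeroDivisionError("loop equations have no unique solution")
    return p_zw + p_zu * k * p_yw / loop


def closed_loop_signals(p_zw, p_zu, p_yw, p_yu, k, w):
    """Solve the loop equations for y, u and z given one value of w."""
    y = p_yw * w / (1.0 - p_yu * k)   # from y = p_yw*w + p_yu*(k*y)
    u = k * y
    z = p_zw * w + p_zu * u
    return y, u, z


if __name__ == "__main__":
    # A disturbance w added to the actuator signal, with the sensor measuring z.
    print(closed_loop_gain(1.0, 1.0, 1.0, 1.0, k=-4.0))
```

In the example call, negative feedback with gain 4 reduces the effect of the disturbance on z to one fifth of its open-loop value.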



2.1.1 Comparison to the Classical Plant 

Our notion of the plant differs from that used in classical control texts in several 
ways. First, our plant includes information about which signals are accessible to 
the controller, whereas in classical control, this is often side information given along 
with the classical plant in the controller design problem. For example, what would 
be called a "state-feedback" control system and an "output feedback" control system 
for a given classical plant, we describe as closed-loop systems for two different plants, 
since the sensed signals (our y) differ in the two systems. Similarly, the distinction 
made in classical control between one degree-of-freedom and two degree-of-freedom 
control systems is expressed in our framework as a difference in plants. 

Figure 2.2 The closed-loop system. The controller processes the sensed 
signals to produce the actuator signals. 

Second, our plant includes information about where exogenous inputs such 
as disturbances and noises enter the system. This information is also given as side 
information in classical control problems, if it is given at all. In classical control, the 
disturbances might be indicated in a block diagram showing where they enter the 
system; some important exogenous inputs and regulated variables are commonly left 
out, since it is expected that the designer will intuitively know that an acceptable 
design cannot excessively amplify, say, sensor noise. 

Similarly, our plant makes the signal z explicit — the idea is that z contains every 
signal that is important to us. In classical control texts we find extensive discussion 
of critical signals such as actuator signals and tracking errors, but no attempt is 
made to list all of the critical signals for a given problem. 

If w and z contain every signal about which we will express a constraint or 
specification, then candidate closed-loop systems can be evaluated by simulations 
or tests involving only the signals w and z. Thus, specifications (in the sense of a 
contract) for the control system could be written in terms of w and z only. 

We believe that the task of defining the signals u, w, y, and z for a system to 
be controlled is itself useful. It helps in forming sensible specifications for a 
controller to be designed, and in identifying the simulations that should be 
done to evaluate a candidate controller. 

2.1.2 Command Inputs and Diagnostic Outputs 

The reader may have noticed that command signals entering the controller and diag- 
nostic signals produced by the controller do not appear explicitly in figure 2.2, even 
though they do in figures 1.1 and 2.3. In this section we show how these operator 
interface signals are treated in our framework. Our treatment of command signals 
simply follows the definitions above: if the command signal is directly accessible to 
the controller, then it is included in the signal y. There remains the question of how 
the command signals enter the plant. Again, we follow the definitions above: the 
command signals are plant inputs, not manipulable by the controller (they are pre- 
sumably manipulable by some external agent issuing the commands), and so they 
must be exogenous inputs, and therefore included in w. Often, exogenous inputs 
that are commands pass directly through the plant to some of the components of 
y, as shown in figure 2.4. Diagnostic outputs are treated in a similar way. 

Figure 2.3 The controller may accept command signals as inputs and 
produce diagnostic and warning signals as outputs (see figure 1.1). This is 
described in our framework by including the command signals in w and y, 
and the diagnostic and warning signals in u and z, as shown in figure 2.4. 



2.2 Assumptions 

In this book we make the following assumptions: 

Assumption 2.1: The signals w, u, z, and y are real vector-valued continuous- 
time signals, i.e., functions from a real, nonnegative time variable into the appro- 
priately dimensioned vector space: 



$$w : \mathbf{R}_+ \to \mathbf{R}^{n_w}, \qquad z : \mathbf{R}_+ \to \mathbf{R}^{n_z},$$

$$u : \mathbf{R}_+ \to \mathbf{R}^{n_u}, \qquad y : \mathbf{R}_+ \to \mathbf{R}^{n_y}.$$



Figure 2.4 In our framework the command signals are included in w and 
pass through the plant to y, so that the controller has access to them. 
Similarly, diagnostic signals produced by the controller are included in u, 
and pass through the plant to z. 



Assumption 2.2: The plant is linear and time-invariant (LTI) and lumped, i.e., 
described by a set of constant coefficient linear differential equations with zero initial 
conditions. 



Assumption 2.3: The controller is also LTI and lumped. 

The differential equations referred to will be described in detail in section 2.5. 



2.2.1 Some Comments 
About assumption 2.2: 

• Many important plants are highly nonlinear, e.g., mechanical systems that 
undergo large motions. 

• Assumption 2.2 is always an approximation, only good for certain ranges of 
values of system signals, over certain time intervals or frequency ranges, and 
so on. 

About assumption 2.3: 

• Even if the plant is LTI, it is still a restriction for the controller to be LTI. 

• We have already noted that control systems that use digital control processors 
process sampled signals. These controllers are linear, but time-varying. 

We wish to emphasize that our restriction to LTI plants and controllers is hardly 
a minor restriction, even if it is a commonly made one. Nevertheless, we believe the 
material of this book is still of great value, for several reasons: 

• Many nonlinear plants are well modeled as LTI systems, especially in regulator 
applications, where the goal is to keep the system state near some operating 
point. 

• A controller that is designed on the basis of an approximate linear model of 
a nonlinear plant often works well with the nonlinear plant, even if the linear 
model of the plant is not particularly accurate. (See the Notes and References 
at the end of this chapter.) 

• Some of the effects of plant nonlinearities can be accounted for in the frame- 
work of an LTI plant and controller. (See chapter 10.) 

• Linear control systems often form the core or basis of control systems designed 
for nonlinear systems, for example in gain-scheduled or adaptive control sys- 
tems. (See the Notes and References at the end of this chapter.) 

• A new approach to the control of nonlinear plants, called feedback lineariza- 
tion, has recently been developed. If feedback linearization is successful, it 
reduces the controller design problem for a nonlinear plant to one for which 
assumption 2.2 holds. (See the Notes and References at the end of this chap- 
ter.) 

• Even when the final controller is time-varying (e.g., when implemented on 
a digital control processor), a preliminary design or analysis of achievable 
performance is usually carried out under the assumption 2.3. This design or 
analysis then helps the designer select appropriate sample rates. 

• The results of this book can be extended to cover linear time- varying plants 
and controllers, in particular, the design of single-rate or multi-rate digital 
controllers. (See chapter 16.) 



2.2.2 Transfer Matrix Notation 

We will now briefly review standard notation for LTI systems. Consider an LTI 
system with a single (scalar) input a and a single (scalar) output b. Such a system 
is completely described by its transfer function, say G, so that 

$$ B(s) = G(s) A(s), \tag{2.1} $$

where A and B are the Laplace transforms of the signals a and b respectively: 

$$ A(s) = \int_0^\infty a(t) e^{-st} \, dt, \qquad B(s) = \int_0^\infty b(t) e^{-st} \, dt. $$

Equivalently, the signals a and b are related by convolution, 

$$ b(t) = \int_0^t g(\tau) a(t - \tau) \, d\tau, \tag{2.2} $$

where g is the impulse response of the linear system, 

$$ G(s) = \int_0^\infty g(t) e^{-st} \, dt. $$

We will write (2.1) and (2.2) as 

b = Ga. (2.3) 

We will also use the symbol G to denote both the transfer function of the LTI 
system, and the LTI system itself. An interpretation of (2.3) is that the LTI system 
G acts on the input signal a to produce the output signal b. We say that G is the 
transfer function from a to b. 
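
As a small numerical illustration of (2.1)-(2.3), the following sketch approximates the convolution (2.2) by a Riemann sum. The first-order system G(s) = 1/(s + 1), with impulse response g(t) = e^{-t}, the unit step input, the step size, and all function names are assumptions chosen for illustration; they do not come from the text.

```python
import math

def convolve_output(g, a, t, dt=1e-3):
    """Approximate b(t) = integral_0^t g(tau) a(t - tau) dtau (midpoint rule)."""
    n = int(round(t / dt))
    total = 0.0
    for k in range(n):
        tau = (k + 0.5) * dt          # midpoint of the k-th subinterval
        total += g(tau) * a(t - tau) * dt
    return total

def g(t):
    # Impulse response of the hypothetical system G(s) = 1/(s + 1).
    return math.exp(-t)

def a(t):
    # Unit step input.
    return 1.0 if t >= 0 else 0.0

b_numeric = convolve_output(g, a, 2.0)
b_exact = 1.0 - math.exp(-2.0)        # step response of 1/(s + 1) at t = 2
print(b_numeric, b_exact)
```

For this example the step response is known in closed form, 1 - e^{-t}, so the numerical value of b = Ga can be checked directly against it.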

Identical notation is used for systems with multiple inputs and outputs. For 
example, suppose that an LTI system has n inputs and m outputs. We collect the 
scalar input signals $a_1, \ldots, a_n$ to form a vector input signal a, 

$$ a(t) = \begin{bmatrix} a_1(t) \\ \vdots \\ a_n(t) \end{bmatrix}, $$

and similarly, we collect the output signals to form the vector signal b. The system 
is completely characterized by its transfer matrix, say G, which is an $m \times n$ matrix 
of transfer functions. All of the previous notation is used to represent the multiple- 
input, multiple-output (MIMO) system G and its vector input signal a and vector 
output signal b. 

2.2.3 Transfer Matrix Representations 

A consequence of assumption 2.2 is that the plant can be represented by a transfer 
matrix P, with a vector input consisting of the vector signals w and u, and a 
vector output consisting of the vector signals z and y. Similarly, a consequence of 
assumption 2.3 is that the controller can be represented by a transfer matrix K, 
with a vector input y and vector output u. We partition the plant transfer matrix 
P as 

$$ P = \begin{bmatrix} P_{zw} & P_{zu} \\ P_{yw} & P_{yu} \end{bmatrix} $$

so that 

$$ z = P_{zw} w + P_{zu} u, \tag{2.4} $$
$$ y = P_{yw} w + P_{yu} u. \tag{2.5} $$



Thus, $P_{zw}$ is the transfer matrix from w to z, $P_{zu}$ is the transfer matrix from u 
to z, $P_{yw}$ is the transfer matrix from w to y, and $P_{yu}$ is the transfer matrix from u 
to y. This decomposition is shown in figure 2.5. We will emphasize this partitioning 
of the plant with dark lines: 

$$ P = \left[ \begin{array}{c|c} P_{zw} & P_{zu} \\ \hline P_{yw} & P_{yu} \end{array} \right]. $$

[Block diagram: exogenous inputs w and actuator inputs u enter the plant P; regulated outputs z and sensor outputs y leave it.] 



Figure 2.5 The decomposed LTI plant. 

Now suppose the controller is operating, so that in addition to (2.4-2.5) we have 

$$ u = Ky. \tag{2.6} $$

We can solve for z in terms of w to get 

$$ z = \left( P_{zw} + P_{zu} K (I - P_{yu} K)^{-1} P_{yw} \right) w, $$

provided $\det(I - P_{yu} K)$ is not identically zero, a well-posedness condition that we 
will always assume. 

Definition 2.3: The closed-loop transfer matrix H is the transfer matrix from w 
to z, with the controller K connected to the plant P: 

$$ H = P_{zw} + P_{zu} K (I - P_{yu} K)^{-1} P_{yw}. \tag{2.7} $$

Thus 

$$ z = Hw. \tag{2.8} $$



The entries of the transfer matrix H are the closed-loop transfer functions from 
each exogenous input to each regulated variable. These entries might represent, for 
example, closed-loop transfer functions from some disturbance to some actuator, 
some sensor noise to some internal variable, or some command signal to some ac- 
tuator signal. The formula (2.7) above shows exactly how each of these closed-loop 
transfer functions depends on the controller K. 
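
For completeness, the elimination behind (2.7) can be written out step by step. The following is a sketch of the algebra, using only (2.4), (2.5), and (2.6) and the well-posedness assumption that $I - P_{yu} K$ is invertible:

```latex
\begin{aligned}
y &= P_{yw} w + P_{yu} u = P_{yw} w + P_{yu} K y
  \quad\Longrightarrow\quad y = (I - P_{yu} K)^{-1} P_{yw} w, \\
u &= K y = K (I - P_{yu} K)^{-1} P_{yw} w, \\
z &= P_{zw} w + P_{zu} u
   = \bigl( P_{zw} + P_{zu} K (I - P_{yu} K)^{-1} P_{yw} \bigr) w = H w .
\end{aligned}
```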

A central theme of this book is that H should contain every closed-loop transfer 
function of interest to us. Indeed, we can arrange for any particular closed-loop 
transfer function in our system to appear in H, as follows. Consider the closed-loop 
system in figure 2.6 with two signals A and B which are internal to the plant. If our 
interest is the transfer function from a signal injected at point A to the signal at 
point B, we need only make sure that one of the exogenous signals injects at A, and 
that the signal at point B is one of our regulated variables, as shown in figure 2.7. 



Figure 2.6 Two signals A and B inside the plant P. 




Figure 2.7 Accessing internal signals A and B from w and z. 



2.3 Some Standard Examples from Classical Control 

In this section we present various examples to illustrate the concepts introduced 
in this chapter. We will also define some classical control structures that we will 
periodically refer to. 



2.3.1 The Classical Regulator 

In this section we consider the classical single-actuator, single-sensor (SASS) reg- 
ulator system. A conventional block diagram is shown in figure 2.8. We use the 
symbol $P_0$ to denote the transfer function of the classical plant. The negative sign 
at the summing junction reflects the classical convention that feedback should be 
"negative". 













Figure 2.8 Conventional block diagram of a classical single-actuator, 
single-sensor regulator system. 



The signal e is called the error signal; it is the difference between the system 
output $y_p$ and the reference or desired output, which is zero for the regulator. The 
goal of the regulator design is to keep $y_p$ small, and u not too large, despite the 
disturbances that act on the system. 

The conventional block diagram in figure 2.8 does not show the disturbances 
that act on the system, nor does it explicitly show which signals are of interest to 
us; this is side information in a conventional description of the regulator problem. 
To cast the classical regulator in our framework, we first add to the block diagram 
in figure 2.8 inputs corresponding to disturbances and outputs indicating the signals 
of importance to us. One way to do this is shown in figure 2.9. 

Figure 2.9 Classical regulator system with exogenous inputs and regulated 
outputs. 

The disturbance $n_{\text{proc}}$ is an actuator-referred process noise; it is the signal that 
recreates the effect of system disturbances on the system output when it is added 
to the actuator signal. The disturbance $n_{\text{sensor}}$ is a sensor noise. Even if this sensor 
noise is small, its existence in figure 2.9 emphasizes the important fact that the 
sensor signal ($y_p + n_{\text{sensor}}$) is not exactly the same as the system output that we 
wish to regulate ($y_p$). 

The output signals we have indicated in figure 2.9 are the actuator signal u and 
the system output $y_p$, since these signals will be important in any regulator design. 

The reader can think of the four explicit input and output signals we have added 
to the classical regulator system as required for a realistic simulation or required to 
evaluate a candidate controller. For a particular problem, it may be appropriate to 
include other input or output signals, e.g., the error signal e. 

We can now define the signals used in our framework. For the exogenous input 
vector we take the process noise $n_{\text{proc}}$ and the sensor noise $n_{\text{sensor}}$, 

$$ w = \begin{bmatrix} n_{\text{proc}} \\ n_{\text{sensor}} \end{bmatrix}. $$

We take the vector of regulated outputs to consist of the system output $y_p$ and the 
actuator signal u, 

$$ z = \begin{bmatrix} y_p \\ u \end{bmatrix}. $$

The control input of the plant is just the actuator signal u, and the sensed output 
is the negative of the system output signal corrupted by the sensor noise: 

$$ y = -(y_p + n_{\text{sensor}}). $$

Note that this is the signal that enters the controller in figure 2.9. 

The plant has three inputs and three outputs; its transfer matrix is 

$$ P = \left[ \begin{array}{c|c} P_{zw} & P_{zu} \\ \hline P_{yw} & P_{yu} \end{array} \right] = \left[ \begin{array}{cc|c} P_0 & 0 & P_0 \\ 0 & 0 & 1 \\ \hline -P_0 & -1 & -P_0 \end{array} \right]. $$

A block diagram of the closed-loop system is shown in figure 2.10. Using 
equation (2.7), we find that the closed-loop transfer matrix H from w to z is 

$$ H = \begin{bmatrix} \dfrac{P_0}{1 + P_0 K} & -\dfrac{P_0 K}{1 + P_0 K} \\[2ex] -\dfrac{P_0 K}{1 + P_0 K} & -\dfrac{K}{1 + P_0 K} \end{bmatrix}. \tag{2.9} $$

Figure 2.10 The closed-loop regulator. 

For future reference, we note the following terms from classical control: 

• $L = P_0 K$ is called the loop gain. 

• $S = 1/(1 + L)$ is called the sensitivity transfer function. 

• $T = 1 - S$ is called the complementary sensitivity transfer function. 

Using these definitions, the classical designer might write the closed-loop transfer 
matrix H as 

$$ H = \begin{bmatrix} S P_0 & -T \\ -T & -T/P_0 \end{bmatrix}. $$

Each entry of H, the 2x2 closed-loop transfer matrix from w to z, is significant. 
The first row consists of the closed-loop transfer functions from the process and 
sensor noises to the output signal $y_p$; our goal is to make these two transfer functions 
"small" in some appropriate sense. The "size" of these two transfer functions tells 
us something about the closed-loop regulation achieved by our control system. The 
second row consists of the closed-loop transfer functions from the process and sensor 
noises to the actuator signal, and thus is related to the actuator effort our control 
system uses. 

The idea is that H contains all the closed-loop transfer functions of interest in 
our regulator design. Thus, the performance of different candidate controllers could 
be compared by their associated H's, using (2.7); the specifications for a regulator 
design could be expressed in terms of the four transfer functions in H. 
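
The three ways of writing the regulator's closed-loop matrix (the general formula (2.7) applied to the partitioned plant, the explicit entries of (2.9), and the classical form in terms of S and T) can be compared numerically at a single frequency. The sketch below does this in plain Python; the plant $P_0(s) = 1/(s(s+1))$, the controller, the evaluation frequency, and the function names are hypothetical choices made only for illustration.

```python
def closed_loop_from_partition(Pzw, Pzu, Pyw, Pyu, K):
    """Evaluate (2.7) when the controller K and P_yu are scalars."""
    loop = K / (1 - Pyu * K)              # K (I - P_yu K)^{-1}, scalar case
    return [[Pzw[i][j] + Pzu[i] * loop * Pyw[j] for j in range(len(Pyw))]
            for i in range(len(Pzu))]

def regulator_H(P0, K):
    """Entries of (2.9), written out explicitly."""
    d = 1 + P0 * K
    return [[P0 / d, -P0 * K / d],
            [-P0 * K / d, -K / d]]

def classical_H(P0, K):
    """Same matrix in terms of loop gain L, sensitivity S, and T = 1 - S."""
    L = P0 * K
    S = 1 / (1 + L)
    T = 1 - S
    return [[S * P0, -T],
            [-T, -T / P0]]

def max_abs_diff(A, B):
    return max(abs(A[i][j] - B[i][j]) for i in range(2) for j in range(2))

s = 2j                         # s = j*omega at omega = 2 rad/s
P0 = 1 / (s * (s + 1))         # hypothetical plant value at s
K = 5 * (s + 1) / (s + 10)     # hypothetical controller value at s

H_partition = closed_loop_from_partition(
    Pzw=[[P0, 0], [0, 0]], Pzu=[P0, 1], Pyw=[-P0, -1], Pyu=-P0, K=K)
H_explicit = regulator_H(P0, K)
H_classical = classical_H(P0, K)

print(max_abs_diff(H_partition, H_explicit))
print(max_abs_diff(H_explicit, H_classical))
```

Both printed differences should be at the level of floating-point rounding, which is a quick sanity check on the signs in the partitioned plant.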

2.3.2 The Classical 1-DOF Control System 

A simple extension of the regulator is the classical one degree-of-freedom (1-DOF) 
controller, shown in figure 2.11. In the 1-DOF control system, the reference signal, 
denoted r, is an external input that can change with time; the regulator is just 
the 1-DOF controller with the reference input fixed at zero. The basic goal in the 
design of the 1-DOF control system is to keep the system output $y_p$ close to the 
reference signal r, despite the disturbances $n_{\text{proc}}$ and $n_{\text{sensor}}$, while ensuring that 
the actuator signal u is not too large. The difference between the system output 
signal and the reference signal is called the tracking error, and denoted e: 

$$ e = y_p - r. $$

(The tracking error is not shown in figure 2.11 or included in z, although it could 
be.) 

Figure 2.11 Classical 1-DOF control system. 



We now describe the 1-DOF control system in our framework. As described in 
section 2.1.2 the reference input r is an exogenous input, along with the process 
and sensor noises, so we take the exogenous input vector to be 

$$ w = \begin{bmatrix} n_{\text{proc}} \\ n_{\text{sensor}} \\ r \end{bmatrix}. $$

The control input of the plant is again the actuator signal u, and we can take the 
vector of regulated outputs to be the same as for the regulator: 

$$ z = \begin{bmatrix} y_p \\ u \end{bmatrix}. $$

The sensed output y for the 1-DOF control system must be determined carefully. 
The controller in the 1-DOF control system does not have direct access to the 
corrupted system output, $y_p + n_{\text{sensor}}$. Instead, the controller input is the tracking 
error signal corrupted by the sensor noise: 

$$ y = r - y_p - n_{\text{sensor}}. $$

The plant has four inputs and three outputs; its transfer matrix is 

$$ P = \left[ \begin{array}{ccc|c} P_0 & 0 & 0 & P_0 \\ 0 & 0 & 0 & 1 \\ \hline -P_0 & -1 & 1 & -P_0 \end{array} \right]. $$

The 1-DOF control system is shown in our framework in figure 2.12. 

Figure 2.12 The 1-DOF control system in our framework. 

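The closed-loop transfer matrix H now has three inputs and two outputs: 

$$ H = \begin{bmatrix} \dfrac{P_0}{1 + P_0 K} & -\dfrac{P_0 K}{1 + P_0 K} & \dfrac{P_0 K}{1 + P_0 K} \\[2ex] -\dfrac{P_0 K}{1 + P_0 K} & -\dfrac{K}{1 + P_0 K} & \dfrac{K}{1 + P_0 K} \end{bmatrix}. \tag{2.11} $$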



The closed-loop transfer matrix H in this example consists of the closed-loop 
transfer matrix of the classical regulator, described in section 2.3.1, with a third column 
appended. The third column consists of the closed-loop transfer functions from 
the reference signal to the regulated variables. Its first entry, $H_{13}$, is the transfer 
function from the reference input to the system output, and is called the 
input-output (I/O) transfer function of the 1-DOF control system. It is the same as the 
complementary sensitivity transfer function. 



2.3.3 The Classical 2-DOF Control System 

A generalization of the classical 1-DOF control system is the classical two degree- 
of-freedom (2-DOF) control system. A conventional block diagram of the 2-DOF 
control system is shown in figure 2.13. The key difference between the 1-DOF 
and 2-DOF control systems is that in the former, the controller processes only the 
corrupted error signal to produce the actuator signal, whereas in the latter, the 
controller has access to both the reference and the corrupted system output signals. 

Figure 2.13 Conventional block diagram of a classical 2-DOF control system. 

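To describe the 2-DOF control system in our framework, we take the same 
actuator, exogenous input, and regulated variable signals as for the 1-DOF control 
system: 

$$ w = \begin{bmatrix} n_{\text{proc}} \\ n_{\text{sensor}} \\ r \end{bmatrix}, \qquad z = \begin{bmatrix} y_p \\ u \end{bmatrix}. \tag{2.12} $$

The sensor signal for the 2-DOF control system is not the same as for the 1-DOF 
control system; it is 

$$ y = \begin{bmatrix} -y_p - n_{\text{sensor}} \\ r \end{bmatrix} = \begin{bmatrix} \tilde{y} \\ r \end{bmatrix}. \tag{2.13} $$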



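The 2-DOF control system is shown in our framework in figure 2.14. The plant 
has four inputs and four outputs; its transfer matrix is 

$$ P = \left[ \begin{array}{c|c} P_{zw} & P_{zu} \\ \hline P_{yw} & P_{yu} \end{array} \right] = \left[ \begin{array}{ccc|c} P_0 & 0 & 0 & P_0 \\ 0 & 0 & 0 & 1 \\ \hline -P_0 & -1 & 0 & -P_0 \\ 0 & 0 & 1 & 0 \end{array} \right]. \tag{2.14} $$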



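The controller K in the 2-DOF control system has two inputs and one output. 
If we write its transfer matrix as 

$$ K = \begin{bmatrix} K_{\tilde{y}} & K_r \end{bmatrix} \tag{2.15} $$

(the subscripts remind the reader of the interpretation), then the closed-loop 
transfer matrix H is 

$$ H = \begin{bmatrix} \dfrac{P_0}{1 + P_0 K_{\tilde{y}}} & -\dfrac{P_0 K_{\tilde{y}}}{1 + P_0 K_{\tilde{y}}} & \dfrac{P_0 K_r}{1 + P_0 K_{\tilde{y}}} \\[2ex] -\dfrac{P_0 K_{\tilde{y}}}{1 + P_0 K_{\tilde{y}}} & -\dfrac{K_{\tilde{y}}}{1 + P_0 K_{\tilde{y}}} & \dfrac{K_r}{1 + P_0 K_{\tilde{y}}} \end{bmatrix}. \tag{2.16} $$

Figure 2.14 The closed-loop 2-DOF system. 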



Just as in the 1-DOF control system, the closed-loop transfer matrix H consists 
of the closed-loop transfer matrix of the classical regulator described in section 2.3.1 
with a third column appended. The interpretations of the elements of H are the 
same as in the 1-DOF control system, and hence we can compare the two control 
systems. If $K_{\tilde{y}} = K_r$, then the closed-loop transfer matrix for the 2-DOF control 
system is the same as the closed-loop transfer matrix for the 1-DOF control system. 
Thus, the 2-DOF control system is more general than the 1-DOF control system. 
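
The claim that the 2-DOF system reduces to the 1-DOF system when the two controller blocks coincide can be checked at a single frequency. The sketch below writes out the 2x3 closed-loop matrices implied by the signal definitions of sections 2.3.2 and 2.3.3 (sensed signals $r - y_p - n_{\text{sensor}}$ and $-y_p - n_{\text{sensor}}$ respectively); the numerical values of the plant and controllers and the function names are hypothetical.

```python
def H_1dof(P0, K):
    """2x3 closed-loop matrix of the 1-DOF system at one frequency."""
    d = 1 + P0 * K
    return [[P0 / d, -P0 * K / d, P0 * K / d],
            [-P0 * K / d, -K / d, K / d]]

def H_2dof(P0, Ky, Kr):
    """2x3 closed-loop matrix of the 2-DOF system with K = [Ky Kr]."""
    d = 1 + P0 * Ky
    return [[P0 / d, -P0 * Ky / d, P0 * Kr / d],
            [-P0 * Ky / d, -Ky / d, Kr / d]]

def max_abs_diff(A, B, cols):
    return max(abs(A[i][j] - B[i][j]) for i in range(2) for j in cols)

s = 1j                          # s = j*omega at omega = 1 rad/s
P0 = 1 / (s * (s + 1))          # hypothetical plant value
K = 3 * (s + 1) / (s + 5)       # hypothetical 1-DOF controller value
Kr_other = 2.0                  # a different, hypothetical reference block

same = H_2dof(P0, K, K)             # Ky = Kr = K
different = H_2dof(P0, K, Kr_other) # only the reference block changed

print(max_abs_diff(same, H_1dof(P0, K), cols=range(3)))
print(max_abs_diff(different, H_1dof(P0, K), cols=range(2)))
```

Changing $K_r$ alone leaves the first two columns (the responses to the two noises) untouched and alters only the third column, which is the extra freedom the 2-DOF architecture provides.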



2.3.4 2-DOF Control System with Multiple Actuators and Sensors 

The standard examples described in the previous sections can be extended to plants 
with multiple actuators and multiple sensors (MAMS), by interpreting the various 
signals as vector signals and expressing the transfer functions as the appropriate 
transfer matrices. As an example, we describe the MAMS 2-DOF control system. 

The block diagrams shown in figures 2.13 and 2.14 can describe the MAMS 
2-DOF control system, provided we interpret all the (previously scalar) signals as 
vector signals. The signals w, u, z, and y are given by the expressions (2.12) 
and (2.13); the plant transfer matrix is essentially the same as (2.14), with the ones 
and zeros replaced by the appropriate identity and zero matrices: 

$$ P = \left[ \begin{array}{ccc|c}
P_0 & 0_{n_{\tilde{y}} \times n_{\tilde{y}}} & 0_{n_{\tilde{y}} \times n_r} & P_0 \\
0_{n_u \times n_u} & 0_{n_u \times n_{\tilde{y}}} & 0_{n_u \times n_r} & I_{n_u \times n_u} \\ \hline
-P_0 & -I_{n_{\tilde{y}} \times n_{\tilde{y}}} & 0_{n_{\tilde{y}} \times n_r} & -P_0 \\
0_{n_r \times n_u} & 0_{n_r \times n_{\tilde{y}}} & I_{n_r \times n_r} & 0_{n_r \times n_u}
\end{array} \right] \tag{2.17} $$

(the subscripts indicate the sizes; $n_r$ is the size of the reference signal r, and need 
not be the same as $n_{\tilde{y}}$, the size of $y_p$). In the sequel we will not write out the 
sizes so explicitly. 

We partition the $n_u \times n_y$ transfer matrix K as in (2.15), such that $K_{\tilde{y}}$ is an $n_u \times n_{\tilde{y}}$ 
transfer matrix, and $K_r$ is an $n_u \times n_r$ transfer matrix (note that $n_y = n_{\tilde{y}} + n_r$). 
The closed-loop transfer matrix H is a generalization of (2.16): 

$$ H = \begin{bmatrix} S P_0 & -P_0 K_{\tilde{y}} S & S P_0 K_r \\ -K_{\tilde{y}} S P_0 & -K_{\tilde{y}} S & \tilde{S} K_r \end{bmatrix} \tag{2.18} $$

where 

$$ S = (I + P_0 K_{\tilde{y}})^{-1}, \qquad \tilde{S} = (I + K_{\tilde{y}} P_0)^{-1}. $$

S is called the sensitivity matrix of the MAMS 2-DOF control system. To distinguish 
it from $\tilde{S}$, S is sometimes called the output-referred sensitivity matrix, for 
reasons that will become clear in chapter 9. The I/O transfer matrix of the MAMS 
2-DOF system is 

$$ T = (I + P_0 K_{\tilde{y}})^{-1} P_0 K_r = P_0 (I + K_{\tilde{y}} P_0)^{-1} K_r, $$

which is the closed-loop transfer matrix from the reference input r to the system 
output signal $y_p$. 
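
The two expressions for the I/O transfer matrix agree because of a standard "push-through" identity. As a short sketch of the argument, with $K_{\tilde{y}}$ denoting the block of K that acts on the corrupted system output, and assuming both inverses exist:

```latex
\begin{aligned}
P_0 (I + K_{\tilde{y}} P_0) &= P_0 + P_0 K_{\tilde{y}} P_0 = (I + P_0 K_{\tilde{y}}) P_0, \\
(I + P_0 K_{\tilde{y}})^{-1} P_0 &= P_0 (I + K_{\tilde{y}} P_0)^{-1}, \\
(I + P_0 K_{\tilde{y}})^{-1} P_0 K_r &= P_0 (I + K_{\tilde{y}} P_0)^{-1} K_r .
\end{aligned}
```

The second line follows from the first by multiplying on the left by $(I + P_0 K_{\tilde{y}})^{-1}$ and on the right by $(I + K_{\tilde{y}} P_0)^{-1}$.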



2.4 A Standard Numerical Example 

In this section we describe a particular plant and several controllers that will be 
used in examples throughout this book. We consider the 1-DOF control system, 
described in section 2.3.2, with 

$$ P_0(s) = P_0^{\text{std}}(s) \triangleq \frac{1}{s^2} \, \frac{10 - s}{10 + s}. $$

$P_0^{\text{std}}$ consists of a double integrator with some excess phase from the allpass term 
(10 - s)/(10 + s), which approximates a 0.2 second delay at low frequencies. 

In various future examples we will encounter the controllers 

$$ K^{(a)}(s) \triangleq \frac{44.14 s^2 + 107.3 s + 39}{s^3 + 10 s^2 + 55.25 s + 78.14}, $$
$$ K^{(b)}(s) \triangleq \frac{219.6 s^2 + 1973.95 s + 724.5}{s^3 + 19.15 s^2 + 105.83 s + 965.95}, $$
$$ K^{(c)}(s) \triangleq \frac{95.05 s^2 + 975 s + 244.95}{s^3 + 23.91 s^2 + 185.87 s + 824.84}, $$
$$ K^{(d)}(s) \triangleq \frac{35.28 s^2 + 360.52 s + 77.46}{s^3 + 19.38 s^2 + 132.8 s + 481.0}. $$

The corresponding closed-loop transfer matrices, which we denote $H^{(a)}$, $H^{(b)}$, $H^{(c)}$, 
and $H^{(d)}$ respectively, can be computed from (2.11). The closed-loop systems that 
result from using the controllers $K^{(a)}$, $K^{(b)}$, $K^{(c)}$, and $K^{(d)}$ can be compared by 
examining the 2x3 transfer matrices $H^{(a)}$, $H^{(b)}$, $H^{(c)}$, and $H^{(d)}$. For example, 
figure 2.15 shows $|H_{12}^{(a)}(j\omega)|$, $|H_{12}^{(b)}(j\omega)|$, $|H_{12}^{(c)}(j\omega)|$, and $|H_{12}^{(d)}(j\omega)|$, i.e., the 
magnitudes of the closed-loop transfer functions from $n_{\text{sensor}}$ to $y_p$. From this figure, 
we can conclude that a high frequency sensor noise will have the greatest effect on 
$y_p$ in the closed-loop system with the controller $K^{(b)}$, and the least effect in the 
closed-loop system with controller $K^{(d)}$. 
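
The comparison described above can be reproduced at a single high frequency with a few lines of plain Python. This is only a sketch: it uses the plant (a double integrator times the allpass term) and the controller coefficients as they are printed in this section, evaluates the sensor-noise-to-output entry $-P_0 K / (1 + P_0 K)$ of the 1-DOF closed-loop matrix, and picks the frequency of 100 rad/s arbitrarily. The helper names are hypothetical.

```python
def polyval(coeffs, s):
    """Evaluate a polynomial (coefficients in descending powers) by Horner's rule."""
    result = 0
    for c in coeffs:
        result = result * s + c
    return result

def P0_std(s):
    """Standard example plant: double integrator with an allpass term."""
    return (1 / s**2) * (10 - s) / (10 + s)

# (numerator, denominator) coefficients as printed in this section.
CONTROLLERS = {
    "a": ([44.14, 107.3, 39], [1, 10, 55.25, 78.14]),
    "b": ([219.6, 1973.95, 724.5], [1, 19.15, 105.83, 965.95]),
    "c": ([95.05, 975, 244.95], [1, 23.91, 185.87, 824.84]),
    "d": ([35.28, 360.52, 77.46], [1, 19.38, 132.8, 481.0]),
}

def H12_magnitude(name, omega):
    """|H_12(j omega)|: closed-loop gain from sensor noise to system output."""
    s = 1j * omega
    num, den = CONTROLLERS[name]
    K = polyval(num, s) / polyval(den, s)
    P0 = P0_std(s)
    return abs(-P0 * K / (1 + P0 * K))

mags = {name: H12_magnitude(name, 100.0) for name in CONTROLLERS}
for name in sorted(mags, key=mags.get, reverse=True):
    print(name, mags[name])
```

At high frequencies the loop gain is small, so the magnitude is governed mainly by the leading numerator coefficient of each controller, which is why the controller with the largest leading coefficient passes the most sensor noise.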

Figure 2.15 Magnitudes of the closed-loop transfer functions from $n_{\text{sensor}}$ 
to $y_p$ for the four different closed-loop transfer matrices. 



2.5 A State-Space Formulation 

We will occasionally refer to state-space realizations of the plant, controller, and 
closed-loop system. The general multiple-actuator, multiple-sensor (MAMS) plant 
P takes two vectors of input signals (w and u) and produces two vectors of output 
signals (z and y); a state-space realization of the plant is thus: 

$$ \dot{x} = A_P x + B_w w + B_u u \tag{2.19} $$
$$ z = C_z x + D_{zw} w + D_{zu} u \tag{2.20} $$
$$ y = C_y x + D_{yw} w + D_{yu} u, \tag{2.21} $$

(with x(0) = 0), so that 

$$ P(s) = \begin{bmatrix} P_{zw}(s) & P_{zu}(s) \\ P_{yw}(s) & P_{yu}(s) \end{bmatrix} = C_P (sI - A_P)^{-1} B_P + D_P, $$

where 

$$ B_P = \begin{bmatrix} B_w & B_u \end{bmatrix}, \qquad C_P = \begin{bmatrix} C_z \\ C_y \end{bmatrix}, \qquad D_P = \begin{bmatrix} D_{zw} & D_{zu} \\ D_{yw} & D_{yu} \end{bmatrix}. $$

Many plant models encountered in practice have the property that $D_{yu} = 0$, or 
equivalently, $P_{yu}(\infty) = 0$; such plants are called strictly proper. For strictly proper 
plants, the state-space formulas that we will encounter are greatly simplified, so we 
make the following assumption: 

Assumption 2.4: The plant is strictly proper: $D_{yu} = 0$ (i.e., $P_{yu}(\infty) = 0$). 

This assumption is not substantial; we make it for convenience and aesthetic reasons. 
The more complicated state-space formulas for the case $D_{yu} \neq 0$ can be found in 
the Notes and References that we cite at the end of each chapter. 

Suppose that our controller has state-space realization 

$$ \dot{x}_K = A_K x_K + B_K y \tag{2.22} $$
$$ u = C_K x_K + D_K y, \tag{2.23} $$

so that 

$$ K(s) = C_K (sI - A_K)^{-1} B_K + D_K. $$

A state-space realization of the closed-loop system can be found by eliminating u 
and y from (2.19-2.21) and (2.22-2.23): 

$$ \dot{x} = (A_P + B_u D_K C_y) x + B_u C_K x_K + (B_w + B_u D_K D_{yw}) w \tag{2.24} $$
$$ \dot{x}_K = B_K C_y x + A_K x_K + B_K D_{yw} w \tag{2.25} $$
$$ z = (C_z + D_{zu} D_K C_y) x + D_{zu} C_K x_K + (D_{zw} + D_{zu} D_K D_{yw}) w \tag{2.26} $$

so that 

$$ H(s) = C_H (sI - A_H)^{-1} B_H + D_H, \tag{2.27} $$

where 

$$ A_H = \begin{bmatrix} A_P + B_u D_K C_y & B_u C_K \\ B_K C_y & A_K \end{bmatrix}, \qquad B_H = \begin{bmatrix} B_w + B_u D_K D_{yw} \\ B_K D_{yw} \end{bmatrix}, $$
$$ C_H = \begin{bmatrix} C_z + D_{zu} D_K C_y & D_{zu} C_K \end{bmatrix}, \qquad D_H = D_{zw} + D_{zu} D_K D_{yw}. $$

The state-space realization of the closed-loop system is shown in figure 2.16. 
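
As a consistency check between the state-space and transfer matrix descriptions, the sketch below builds the closed-loop realization for a plant and controller that each have a single state and scalar signals, and compares the resulting H(s) with the closed-loop formula of definition 2.3 at one complex frequency. It assumes a strictly proper plant (no direct feedthrough from u to y); all numerical values and function names are hypothetical.

```python
def closed_loop_realization(AP, Bw, Bu, Cz, Cy, Dzw, Dzu, Dyw, AK, BK, CK, DK):
    """Closed-loop (A_H, B_H, C_H, D_H) for scalar blocks, assuming D_yu = 0."""
    AH = [[AP + Bu * DK * Cy, Bu * CK],
          [BK * Cy, AK]]
    BH = [Bw + Bu * DK * Dyw, BK * Dyw]
    CH = [Cz + Dzu * DK * Cy, Dzu * CK]
    DH = Dzw + Dzu * DK * Dyw
    return AH, BH, CH, DH

def H_from_state_space(s, AH, BH, CH, DH):
    """C_H (sI - A_H)^{-1} B_H + D_H using the 2x2 adjugate formula."""
    m11, m12 = s - AH[0][0], -AH[0][1]
    m21, m22 = -AH[1][0], s - AH[1][1]
    det = m11 * m22 - m12 * m21
    x1 = (m22 * BH[0] - m12 * BH[1]) / det
    x2 = (-m21 * BH[0] + m11 * BH[1]) / det
    return CH[0] * x1 + CH[1] * x2 + DH

def H_from_transfer_functions(s, AP, Bw, Bu, Cz, Cy, Dzw, Dzu, Dyw, AK, BK, CK, DK):
    """P_zw + P_zu K (1 - P_yu K)^{-1} P_yw with scalar transfer functions."""
    r = 1 / (s - AP)
    Pzw, Pzu = Cz * r * Bw + Dzw, Cz * r * Bu + Dzu
    Pyw, Pyu = Cy * r * Bw + Dyw, Cy * r * Bu      # D_yu = 0
    K = CK * BK / (s - AK) + DK
    return Pzw + Pzu * K * Pyw / (1 - Pyu * K)

# Hypothetical numbers, chosen only to exercise every term.
params = dict(AP=-1.0, Bw=1.0, Bu=2.0, Cz=1.0, Cy=1.0, Dzw=0.0, Dzu=0.5,
              Dyw=0.3, AK=-3.0, BK=1.0, CK=-2.0, DK=-0.5)
s = 1 + 2j

H_ss = H_from_state_space(s, *closed_loop_realization(**params))
H_tf = H_from_transfer_functions(s, **params)
print(H_ss, H_tf)
```

The two values should agree to rounding error; a sign or index slip in any of the closed-loop matrices shows up immediately as a mismatch.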




Figure 2.16 State-space realizations of the plant and controller connected 
to form a state-space realization of the closed-loop system. 



Notes and References 

Standard Descriptions and Nomenclature for Control Systems 

In 1954, the Feedback Control Systems Committee of the AIEE approved an AIEE Pro- 
posed Standard of Terminology for Feedback Control Systems, which included a standard 
for the block diagram (see [AIE51] and [NGK57]). 

Two-Input, Two-Output Plant Description 

The description of a plant that explicitly shows the exogenous input and the regulated 
variables is now common, and the symbols w, u, z, and y are becoming standard. They 
appear, for example, in Francis [Fra87]. Nett [Net86] explicitly discusses the command 
inputs to the controller and the diagnostic outputs from the controller, which he denotes 
w' and z' , respectively (see figures 2.3 and 2.4). 

An example of a diagnostic output is the output prediction error of an observer; a sub- 
stantial increase in the size of this signal may indicate that the plant is no longer a good 
model of the system to be controlled, i.e., some system failure has occurred. This idea is 
treated in the survey by Isermann [Ise84]. 

Nonlinear Controllers Based on LTI Controllers 

Several methods for designing nonlinear controllers for nonlinear plants are based on the 
design of LTI controllers, e.g., 

• Linear variational method. An LTI controller is designed for an approximate LTI 
model of a nonlinear plant (near some equilibrium point). We will see a specific 
example of this in chapter 10. 

• Gain scheduling. For a family of equilibrium points, approximate LTI models of a 
nonlinear plant are developed, and for each of these LTI models an LTI controller 
is designed. On the basis of the sensed signals, y, an estimate is made of the 
equilibrium point that is "closest" to the current plant state, and the corresponding 
LTI controller is "switched in". The overall controller thus consists of a family of 
LTI controllers along with a selection algorithm; this overall controller is nonlinear. 

In cases where a sensor can readily or directly measure this "closest" equilibrium 
point, and the family of LTI controllers differ only in some parameter values or 
"gains" , the resulting controller is called gain scheduled. For example, many aircraft 
controllers are gain scheduled; the equilibrium points might be parametrized by 
altitude and airspeed. 

• Adaptive control. An adaptive controller uses an identification procedure based 
on the signals y and u to estimate the "closest" equilibrium point, or other system 
parameters. This estimate is often used for gain scheduling. See for example [AW89, 
SB88, GS84, ABJ86]. 

• Feedback linearization. In many cases, it is possible to construct a preliminary 
nonlinear feedback that makes the plant, with the preliminary feedback loop closed, 
linear and time-invariant, and thus amenable to the methods of this book. This idea 
is studied in the field of geometric control theory; see for example [HSM83]. A clear 
exposition can be found in chapters 4 and 5 of the book by Isidori [Isi89], which 
also has many references. 

Control System Architectures 

A description of different control architectures can be found in, e.g., Lunze [Lun89, p31], 
Anderson and Moore [AM90, p212], and Maciejowski [Mac89, §1.3]. Discussions of 
control architectures appear in the articles [DL85] and [Net86]. 

We noted that the 2-DOF control system is more general than the 1-DOF control system, 
and yet the 1-DOF control system architecture is widely chosen. In some cases, the 
actual sensor used is a differential sensor, which directly senses the difference between 
the reference and the system output, so that the 1-DOF plant accurately reflects signals 
accessible to the controller. But in many cases, the system output signal and the reference 
input signal are separately sensed, as in the 2-DOF controller, and a design decision is 
made to have the controller process only their difference. 

One possible reason for this is that the block diagram in figure 2.11 has been widely used 
to describe the general feedback paradigm. In The Origins of Feedback Control [May70], 
for example, Mayr defines feedback control as forming u from the error signal e (i.e., the 
1-DOF control system). It would be better to refer to the general principle behind the 
1-DOF control system as the error-nulling paradigm, leaving feedback paradigm to refer to 
the more general scheme depicted in figure 2.2. 

The Standard Example Plant 

The idea of using a double integrator plant with some excess phase as a simple but realistic 
typical plant with which to explore control design tradeoffs is taken from a study presented 
by Stein in [FBS87]. A discretized version of our standard example plant is considered in 
Barratt and Boyd [BB89]. 

State-Space Descriptions 

Comprehensive texts include Kailath [Kai80], Chen [Che84]; an earlier text is Zadeh and 
Desoer [ZD63]. The appendices in the book by Anderson and Moore [AM90] contain all 
of the material about state-space LTI systems that is needed for this book. 



Chapter 3 

Controller Design 
Specifications and Approaches 



In this chapter we develop a unified framework for describing the goals of con- 
troller design in terms of families of boolean design specifications. We show how 
various approaches, e.g., multicriterion optimization or classical optimization, can 
be described in this framework. 

Just as there are many different architectures or configurations for control systems, 
there are many different general approaches to expressing the design goals and 
objectives for controller design. One example is the optimal controller paradigm: the 
goal is to determine a controller that minimizes a single cost function or objective. 
In another approach, multicriterion optimization, several different cost functions are 
specified and the goal is to identify controllers that perform mutually well on these 
goals. The purpose of this chapter is to develop a unified framework for describing 
design specifications, and to explain how these various approaches can be described 
using this framework. 

Throughout this chapter, we assume that a fixed plant is under consideration: 
the signals w, u, z, and y have been defined, and the plant transfer matrix P has 
been determined. As described in chapter 2, the definitions of the exogenous input 
w and the regulated variables z should contain enough signals that we can express 
every goal of the controller design in terms of H, the closed-loop transfer matrix 
from w to z. 



3.1 Design Specifications 

We begin by defining the basic or atomic notion of a design specification: 

Definition 3.1: A design specification $\mathcal{D}$ is a boolean function or test on the 
closed-loop transfer matrix H. 

Thus, for each candidate closed-loop transfer matrix H, a design specification $\mathcal{D}$ is 
either satisfied or not satisfied: design specifications are simple tests, with a "pass" 
or "fail" outcome. A design specification is a predicate on transfer matrices, i.e., a 
function that takes an $n_z \times n_w$ transfer matrix as argument and returns a boolean 
result: 

$$ \mathcal{D} : \mathcal{H} \to \{\text{PASS}, \text{FAIL}\}, $$

where $\mathcal{H}$ denotes the set of all $n_z \times n_w$ transfer matrices. 

It may seem strange to the reader that we express design specifications in terms 
of the closed-loop transfer matrix H instead of the controller transfer matrix K, 
since we really design the controller transfer matrix K, and not the resulting 
closed-loop transfer matrix H. Of course, to each candidate controller K there corresponds 
the resulting closed-loop transfer matrix H (given by the formula (2.7) in 
definition 2.3), which either passes or fails a given design specification; hence, we can 
think of a design specification as inducing a boolean test on the transfer matrices of 
candidate controllers. In fact, we will sometimes abuse notation by saying "the 
controller K satisfies $\mathcal{D}$", meaning that the corresponding closed-loop transfer matrix 
H satisfies $\mathcal{D}$. 

Such a PASS/FAIL test on candidate controllers would be logically equivalent 
to the design specification, which is a PASS/FAIL test on closed-loop transfer 
matrices, and perhaps more natural. We will see in chapter 6, however, that there are 
important geometric advantages to expressing design specifications in terms of the 
closed-loop transfer matrix H and not directly in terms of the controller K. 
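
The idea of a design specification as a predicate on H, and of the test it induces on controllers, can be sketched in code. Everything in the block below is hypothetical: a transfer matrix is modeled as a Python function of the complex frequency s, the example specification is a bound on the sensor-noise-to-output gain, and it is checked only on a small frequency grid, which is an approximation to a bound over all frequencies. The plant and the two constant-gain controllers are illustrative and say nothing about closed-loop stability.

```python
from typing import Callable, List

TransferMatrix = Callable[[complex], List[List[complex]]]   # s -> H(s)
DesignSpec = Callable[[TransferMatrix], bool]                # H -> PASS/FAIL

def closed_loop_1dof(P0, K) -> TransferMatrix:
    """Map a plant and controller to the 2x3 closed-loop matrix of a 1-DOF system."""
    def H(s):
        p, k = P0(s), K(s)
        d = 1 + p * k
        return [[p / d, -p * k / d, p * k / d],
                [-p * k / d, -k / d, k / d]]
    return H

def sensor_noise_spec(H: TransferMatrix) -> bool:
    """Hypothetical specification: |H_12(j w)| <= 0.1 on a frequency grid."""
    return all(abs(H(1j * w)[0][1]) <= 0.1 for w in (10.0, 20.0, 50.0, 100.0))

def controller_satisfies(spec: DesignSpec, P0, K) -> bool:
    """The boolean test on controllers induced by a specification on H."""
    return spec(closed_loop_1dof(P0, K))

def plant(s):
    return 1 / (s * (s + 1))

def low_gain(s):
    return 1.0

def high_gain(s):
    return 100.0

print(controller_satisfies(sensor_noise_spec, plant, low_gain))
print(controller_satisfies(sensor_noise_spec, plant, high_gain))
```

The specification itself never refers to K; the induced test on controllers is just the composition of the map from K to H with the predicate on H.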

3.1.1 Some Examples 

Some possible design specifications for the standard example described in section 2.4 
are: 

• Maximum step response overshoot. $\mathcal{D}_{\text{os}}$ will denote the design specification 
"the step response overshoot from the reference signal to $y_p$ is less than 10%". 

Let us express $\mathcal{D}_{\text{os}}$ more explicitly in terms of H, the 2x3 closed-loop transfer 
matrix. The reference signal r is the third exogenous input ($w_3$), and $y_p$ is the 
first regulated variable ($z_1$), so $H_{13}$ is the closed-loop transfer function from 
the reference signal to $y_p$, and its step response is given by 

$$ s(t) = \frac{1}{2\pi} \int_{-\infty}^{\infty} \frac{H_{13}(j\omega)}{j\omega} e^{j\omega t} \, d\omega $$

for t > 0. Thus, $\mathcal{D}_{\text{os}}$ can be expressed as 

$$ \mathcal{D}_{\text{os}} : \quad \frac{1}{2\pi} \int_{-\infty}^{\infty} \frac{H_{13}(j\omega)}{j\omega} e^{j\omega t} \, d\omega < 1.1 \quad \text{for all } t > 0. $$

• Maximum RMS actuator effort. $\mathcal{D}_{\text{act\_eff}}$ will denote the design specification: 
"the RMS value of the actuator signal due to the sensor and process noises is 
less than 0.1". 

We shall see in chapters 5 and 8 that $\mathcal{D}_{\text{act\_eff}}$ can be expressed as 

$$ \mathcal{D}_{\text{act\_eff}} : \quad \frac{1}{2\pi} \int_{-\infty}^{\infty} \left( |H_{21}(j\omega)|^2 S_{\text{proc}}(\omega) + |H_{22}(j\omega)|^2 S_{\text{sensor}}(\omega) \right) d\omega < 0.1^2, $$

where $S_{\text{proc}}$ and $S_{\text{sensor}}$ are the power spectral densities of the noises $n_{\text{proc}}$ 
and $n_{\text{sensor}}$, respectively. 

• Closed-loop stability. $\mathcal{D}_{\text{stable}}$ will denote the design specification: "the 
closed-loop transfer matrix H is achieved by a controller that stabilizes the plant". 
(The precise meaning of $\mathcal{D}_{\text{stable}}$ can be found in chapter 7.) 

A detailed understanding of these design specifications is not necessary in this
chapter; the important point here is that given any $2 \times 3$ transfer matrix $H$, each
of the design specifications $\mathcal{D}_{\mathrm{os}}$, $\mathcal{D}_{\mathrm{act\_eff}}$, and $\mathcal{D}_{\mathrm{stable}}$ is either true or false. This
is independent of our knowing how to verify whether a design specification holds
or not for a given transfer matrix. For example, the reader may not yet know of
any explicit procedure for determining whether a transfer matrix $H$ satisfies $\mathcal{D}_{\mathrm{stable}}$;
nevertheless it is a valid design specification.

These design specifications should be contrasted with the associated informal
design goals "the step response from the reference signal to $y_p$ should not overshoot
too much" and "the sensor and process noises should not cause $u$ to be too large".
These are not design specifications, even though these qualitative goals may better
capture the designer's intentions than the precise design specifications $\mathcal{D}_{\mathrm{os}}$ and
$\mathcal{D}_{\mathrm{act\_eff}}$. Later in this chapter we will discuss how these informal design goals can
be better expressed using families of design specifications.

3.1.2 Comparing and Ordering Design Specifications 

In some cases, design specifications can be compared. We say that design specification
$\mathcal{D}_1$ is tighter or stronger than $\mathcal{D}_2$ (or, $\mathcal{D}_2$ is looser or weaker than $\mathcal{D}_1$) if
all transfer matrices that satisfy $\mathcal{D}_1$ also satisfy $\mathcal{D}_2$. We will say that $\mathcal{D}_1$ is strictly
tighter or strictly stronger than $\mathcal{D}_2$ if it is tighter and in addition there is a transfer
matrix that satisfies $\mathcal{D}_2$ but not $\mathcal{D}_1$.

An obvious but extremely important fact is that given two design specifications,
it is not always possible to compare them: it is possible that neither is a stronger
specification. The ordering of design specifications by strength is a partial ordering,
not a linear or total ordering. We can draw a directed graph of the relations between
different specifications by connecting strictly weaker specifications to stronger ones
by arrows, and deleting the arrows that follow from transitivity (i.e., $\mathcal{D}_1$ is stronger
than $\mathcal{D}_2$ which is stronger than $\mathcal{D}_3$ implies $\mathcal{D}_1$ is stronger than $\mathcal{D}_3$). An example
of a linear ordering is shown in figure 3.1(a): every pair of specifications can be
compared since there is a directed path connecting any two specifications. By
comparison, a partial but not linear ordering of design specifications is shown in
figure 3.1(b). Specification $\mathcal{D}_D$ is tighter than $\mathcal{D}_C$, which in turn is tighter than
both $\mathcal{D}_A$ and $\mathcal{D}_B$. Similarly, specification $\mathcal{D}_F$ is tighter than $\mathcal{D}_E$, which is itself
tighter than $\mathcal{D}_C$. However, specifications $\mathcal{D}_E$ and $\mathcal{D}_D$ cannot be compared; nor can
the design specifications $\mathcal{D}_F$ and $\mathcal{D}_D$.







Figure 3.1 In these directed graphs each node represents a specification.
Arrows connect a weaker specification to a tighter one. The graph in (a)
shows a linear ordering: all specifications can be compared. The graph
in (b) shows a partial, but not linear, ordering: some specifications can
be compared (e.g., $\mathcal{D}_D$ is tighter than $\mathcal{D}_A$), but others cannot (e.g., $\mathcal{D}_F$ is
neither weaker nor tighter than $\mathcal{D}_D$).

We may form new design specifications from other design specifications using
any boolean operation, for example conjunction, meaning joint satisfaction:

$$ \mathcal{D}_1 \wedge \mathcal{D}_2 : \quad H \text{ satisfies } \mathcal{D}_1 \text{ and } \mathcal{D}_2. $$

The new design specification $\mathcal{D}_1 \wedge \mathcal{D}_2$ is tighter than both $\mathcal{D}_1$ and $\mathcal{D}_2$.
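To make the boolean nature of specifications concrete, here is a minimal Python sketch. It assumes, purely for illustration, that a design is summarized by two hypothetical numbers (`overshoot` and `rms_u`) rather than by a full transfer matrix; the names `conj` and `refutes_tighter` are our own. Note that a finite sample of designs can only refute the relation "tighter than", never prove it.

```python
from typing import Callable, Dict, Iterable

# Hypothetical summary of a closed-loop transfer matrix H: only the two
# numbers that the example specifications depend on.
Design = Dict[str, float]
Spec = Callable[[Design], bool]

def d_os(h: Design) -> bool:
    """Overshoot specification: step response overshoot at most 10%."""
    return h["overshoot"] <= 0.10

def d_act_eff(h: Design) -> bool:
    """Actuator effort specification: RMS actuator signal at most 0.1."""
    return h["rms_u"] <= 0.1

def conj(*specs: Spec) -> Spec:
    """Conjunction of specifications: satisfied when every member is satisfied."""
    return lambda h: all(spec(h) for spec in specs)

def refutes_tighter(d1: Spec, d2: Spec, designs: Iterable[Design]) -> bool:
    """True if some sampled design satisfies d1 but not d2, which shows that
    d1 is NOT tighter than d2."""
    return any(d1(h) and not d2(h) for h in designs)

samples = [
    {"overshoot": 0.05, "rms_u": 0.20},  # meets d_os only
    {"overshoot": 0.15, "rms_u": 0.05},  # meets d_act_eff only
    {"overshoot": 0.08, "rms_u": 0.09},  # meets both
]
both = conj(d_os, d_act_eff)
print([both(h) for h in samples])
# Neither example specification is tighter than the other ...
print(refutes_tighter(d_os, d_act_eff, samples), refutes_tighter(d_act_eff, d_os, samples))
# ... while the sample gives no counterexample to "the conjunction is tighter than d_os".
print(refutes_tighter(both, d_os, samples))
```

The first two calls illustrate specifications that cannot be compared; the last is consistent with the fact that a conjunction is tighter than each of its members.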

We say that a design specification is infeasible or inconsistent if it is not satisfied
by any transfer matrix; it is feasible or consistent if it is satisfied by at least one
transfer matrix. We say that the set of design specifications $\{\mathcal{D}_1, \ldots, \mathcal{D}_L\}$ is jointly
infeasible or jointly inconsistent if the conjunction $\mathcal{D}_1 \wedge \cdots \wedge \mathcal{D}_L$ is infeasible; if the
conjunction $\mathcal{D}_1 \wedge \cdots \wedge \mathcal{D}_L$ is feasible, then we say that the set of design specifications
$\{\mathcal{D}_1, \ldots, \mathcal{D}_L\}$ is jointly feasible or achievable.

3.2 The Feasibility Problem

Given a specific set of design specifications, we can pose the feasibility problem of
determining whether all of our design specifications can be simultaneously satisfied:

Definition 3.2: Feasibility controller design problem: given a set of design specifications
$\{\mathcal{D}_1, \ldots, \mathcal{D}_L\}$, determine whether it is jointly feasible. If so, find a closed-loop
transfer matrix $H$ that meets the design specifications $\mathcal{D}_1, \ldots, \mathcal{D}_L$.

Like design specifications, the feasibility problem has a boolean outcome. If this
outcome is negative, we say the set of design specifications $\{\mathcal{D}_1, \ldots, \mathcal{D}_L\}$ is too tight,
infeasible, or unachievable; if the outcome of the feasibility problem is positive, we
say that $\{\mathcal{D}_1, \ldots, \mathcal{D}_L\}$ is achievable or feasible. It is only for ease of interpretation
that we pose the feasibility problem in terms of sets of design specifications, since
it is equivalent to feasibility of the single design specification $\mathcal{D}_1 \wedge \cdots \wedge \mathcal{D}_L$.

The feasibility problem is not meant by itself to capture the whole controller 
design problem; rather it is meant to be a basic or atomic problem that we can 
use to describe other, more subtle, formulations of the controller design problem. 
In the rest of this chapter we will describe various design approaches in terms of 
families of feasibility problems. This is exactly like our definition of the standard 
plant, which is meant to be a standard form in which to describe the many possible 
architectures for controllers. The motivation is the same — to provide the means to 
sensibly compare apparently different formulations. 

3.3 Families of Design Specifications 

A single fixed set of design specifications usually does not adequately capture the no- 
tion of "suitable" or "satisfactory" system performance, which more often involves 
tradeoffs among competing desirable qualities, and specifications with varying de- 
grees of hardness. Hardness is a quality that describes how firmly the designer 
insists on a specification, or how flexible the designer is in accepting violations of 
a design specification. Thus, the solution of a single, fixed, feasibility problem may 
be of limited utility. These vaguer notions of satisfactory performance can be mod- 
eled by considering families of related feasibility problems. The designer can then 
choose among those sets of design specifications that are achievable. 

To motivate this idea, we consider again the specifications described in section 3.1.1.
Suppose first that the solution to the feasibility problem with the set of
specifications $\{\mathcal{D}_{\mathrm{os}}, \mathcal{D}_{\mathrm{act\_eff}}, \mathcal{D}_{\mathrm{stable}}\}$ is affirmative, and that $H$ simultaneously
satisfies $\mathcal{D}_{\mathrm{os}}$, $\mathcal{D}_{\mathrm{act\_eff}}$, and $\mathcal{D}_{\mathrm{stable}}$. If the set of specifications $\{\mathcal{D}_{\mathrm{os}}, \mathcal{D}_{\mathrm{act\_eff}}, \mathcal{D}_{\mathrm{stable}}\}$
adequately captures our notion of a satisfactory design, then we are done. But the
designer may wonder how much smaller the overshoot and actuator effort can be made
than the original limits of 10% and 0.1, respectively. In other words, the designer
will often want to know not just that the set of specifications $\{\mathcal{D}_{\mathrm{os}}, \mathcal{D}_{\mathrm{act\_eff}}, \mathcal{D}_{\mathrm{stable}}\}$
is achievable, but in addition how much these specifications could be tightened while
remaining achievable.

A similar situation occurs if the solution to the feasibility problem with the
design specifications $\{\mathcal{D}_{\mathrm{os}}, \mathcal{D}_{\mathrm{act\_eff}}, \mathcal{D}_{\mathrm{stable}}\}$ is negative. In this case the designer
might like to know how "close" this set of design specifications is to achievable. For
example, how much larger would the limit on overshoot have to be made for this
set of design specifications to be achievable?

Questions of this sort can be answered by considering families of design specifications,
which are often indexed by numbers. In our example above, we could
consider a family of overshoot specifications indexed, or parametrized, by the
allowable overshoot. For $a_1 \geq 0$ we define

$$ \mathcal{D}_{\mathrm{os}}^{(a_1)} : \quad \text{overshoot of step response from } r \text{ to } y_p \leq a_1\%. $$

Note that $\mathcal{D}_{\mathrm{os}}^{(a_1)}$ is a different specification for each value of $a_1$. Similarly, a family
of actuator effort specifications for $a_2 \geq 0$ is

$$ \mathcal{D}_{\mathrm{act\_eff}}^{(a_2)} : \quad \text{RMS deviation of } u \text{ due to the sensor and process noises} \leq a_2. $$

Thus for each $a_1 \geq 0$ and $a_2 \geq 0$ we have a different set of design specifications,

$$ \mathcal{D}^{(a_1, a_2)} = \mathcal{D}_{\mathrm{os}}^{(a_1)} \wedge \mathcal{D}_{\mathrm{act\_eff}}^{(a_2)} \wedge \mathcal{D}_{\mathrm{stable}}. $$

Suppose we solve the feasibility problem with design specifications $\mathcal{D}^{(a_1, a_2)}$ for
each $a_1$ and $a_2$, shading the area in the $(a_1, a_2)$ plane corresponding to feasible
$\mathcal{D}^{(a_1, a_2)}$, as shown in figure 3.2. We call the shaded region in figure 3.2, with some
abuse of notation, a plot of achievable specifications in performance space.

From this plot we can better answer the vague questions posed above. Our
original set of design specifications corresponds to $a_1 = 10\%$ and $a_2 = 0.1$, shown
as $\mathcal{D}_X$ in figure 3.2. From figure 3.2 we may conclude, for example, that we may
tighten the overshoot specification to $a_1 = 6\%$ and still have an achievable set
of design specifications ($\mathcal{D}_Y$ in figure 3.2), but if we further tighten the overshoot
specification to $a_1 = 4\%$, we have a set of specifications ($\mathcal{D}_Z$) that is too tight.
Alternatively, we could tighten the actuator effort specification to $a_2 = 0.06$ and
also have an achievable set of design specifications ($\mathcal{D}_W$).
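The construction of such a plot can be sketched in a few lines of Python. The function `achievable` below is a hypothetical stand-in for actually solving the feasibility problem: its boundary $a_1 a_2 = 0.46$ is made up for illustration (chosen only so that the four specifications discussed above come out as described) and is not the true boundary for the standard example.

```python
def achievable(a1_percent: float, a2: float) -> bool:
    """Hypothetical feasibility oracle for the specification with overshoot
    limit a1_percent and actuator effort limit a2. The boundary
    a1 * a2 = 0.46 is invented for this sketch."""
    return a1_percent > 0 and a1_percent * a2 >= 0.46

# The four specifications discussed in the text, as (a1 in percent, a2).
specs = {"X": (10.0, 0.1), "Y": (6.0, 0.1), "Z": (4.0, 0.1), "W": (10.0, 0.06)}
verdicts = {name: achievable(a1, a2) for name, (a1, a2) in specs.items()}
print(verdicts)

# Coarse text rendering of performance space: '#' achievable, '.' not.
a1_grid = [2.0 * k for k in range(1, 11)]        # 2, 4, ..., 20 percent
a2_grid = [0.02 * k for k in range(10, 0, -1)]   # 0.20 down to 0.02
for a2 in a2_grid:
    row = "".join("#" if achievable(a1, a2) else "." for a1 in a1_grid)
    print(f"{a2:4.2f} {row}")
```

In practice each call to the oracle is a full feasibility problem, so the cost of the plot grows with the number of grid points.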

By considering the feasibility problem for families of design specifications, we 
have considerably more information than if we only consider one fixed set of specifi- 
cations, and therefore are much better able to decide what a satisfactory design is. 
We will return to a discussion of this general idea, and this particular example, after 
we describe a general and common method for parametrizing design specifications. 

3.4 Functional Inequality Specifications 

The parametrized families $\mathcal{D}_{\mathrm{os}}^{(a_1)}$ and $\mathcal{D}_{\mathrm{act\_eff}}^{(a_2)}$ described above have the same general
form. Each of these families involves a particular quantity (overshoot and actuator
effort, respectively) for which a smaller value is "better", or at least, a tighter
specification. The parameter ($a_1$ and $a_2$, respectively) is simply a limit on the
associated quantity.

Figure 3.2 A plot of achievable specifications in performance space: the region
where the set of design specifications $\{\mathcal{D}_{\mathrm{os}}^{(a_1)}, \mathcal{D}_{\mathrm{act\_eff}}^{(a_2)}, \mathcal{D}_{\mathrm{stable}}\}$ is achievable
is shaded. The specification $\mathcal{D}_X$ corresponds to $a_1 = 10\%$ and $a_2 = 0.1$,
the original design specifications. The specification $\mathcal{D}_Y$ represents a tightening
of the overshoot specification over $\mathcal{D}_X$ to $a_1 = 6\%$ and is an achievable
specification, whereas the further tightening of the overshoot specification
to $a_1 = 4\%$ ($\mathcal{D}_Z$) is not achievable.

This notion of "quantity" we make more precise as a functional on transfer
matrices. A functional is just a function that assigns to each $n_z \times n_w$ transfer
matrix $H$ a real number (or possibly $\infty$).

Definition 3.3: A functional $\phi$ (on transfer matrices) is a function
$\phi : \mathcal{H} \to \mathbf{R} \cup \{\infty\}$.



In contrast to a design specification, which a transfer matrix either passes or fails,
a functional can be thought of as measuring some property or quality of the transfer
matrix, without judgment. Many functionals can be interpreted as measuring the
size of a transfer matrix or some of its entries (the subject of chapters 4 and 5).
Allowing functionals to return the value $+\infty$ for some transfer matrices will be
convenient for us.

Two functionals related to our example above are the overshoot,

$$ \phi_{\mathrm{os}}(H) = \sup_{t \geq 0} \left( \frac{1}{2\pi} \int_{-\infty}^{\infty} \frac{H_{13}(j\omega)}{j\omega} \, e^{j\omega t} \, d\omega - 1 \right), $$

and the RMS response at $u$ due to the sensor and process noises,

$$ \phi_{\mathrm{act\_eff}}(H) = \left( \frac{1}{2\pi} \int_{-\infty}^{\infty} \left( |H_{21}(j\omega)|^2 S_{\mathrm{proc}}(\omega) + |H_{22}(j\omega)|^2 S_{\mathrm{sensor}}(\omega) \right) d\omega \right)^{1/2}. $$

The families of design specifications $\mathcal{D}_{\mathrm{os}}^{(a_1)}$ and $\mathcal{D}_{\mathrm{act\_eff}}^{(a_2)}$ can be expressed as the
functional inequality specifications:

$$ \mathcal{D}_{\mathrm{os}}^{(a_1)} : \quad \phi_{\mathrm{os}}(H) \leq a_1, $$

and

$$ \mathcal{D}_{\mathrm{act\_eff}}^{(a_2)} : \quad \phi_{\mathrm{act\_eff}}(H) \leq a_2. $$
More generally, we have:

Definition 3.4: A functional inequality specification is a specification of the form

$$ \mathcal{D}_{\phi}^{(a)} : \quad \phi(H) \leq a, \qquad (3.1) $$

where $\phi$ is a functional on transfer matrices and $a$ is a constant.

Thus $a$ can be interpreted as the maximum allowable value of the functional $\phi$; by
varying $a$, the expression (3.1) sweeps out a family of design specifications.
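The distinction between a functional (a number) and a functional inequality specification (a boolean) can be sketched numerically. The Python fragment below assumes, for illustration only, that the transfer function from the reference to $y_p$ is the textbook second-order system $\omega_n^2 / (s^2 + 2\zeta\omega_n s + \omega_n^2)$ with $0 < \zeta < 1$, whose step response is known in closed form; this is not the standard example of section 2.4, and the supremum over $t$ is approximated on a finite grid.

```python
import math

def step_response(t: float, zeta: float, wn: float) -> float:
    """Unit step response of wn^2 / (s^2 + 2*zeta*wn*s + wn^2), 0 < zeta < 1."""
    wd = wn * math.sqrt(1.0 - zeta ** 2)          # damped natural frequency
    decay = math.exp(-zeta * wn * t)
    return 1.0 - decay * (math.cos(wd * t) + zeta * wn / wd * math.sin(wd * t))

def phi_os(zeta: float, wn: float, t_end: float = 20.0, n: int = 20001) -> float:
    """Overshoot functional: sup over t >= 0 of s(t) - 1, approximated by the
    maximum over n equally spaced times in [0, t_end]."""
    return max(step_response(t_end * k / (n - 1), zeta, wn) - 1.0 for k in range(n))

def satisfies(phi_value: float, a: float) -> bool:
    """Functional inequality specification: phi(H) <= a."""
    return phi_value <= a

overshoot = phi_os(zeta=0.6, wn=1.0)
print(overshoot, satisfies(overshoot, 0.10))
```

For $\zeta = 0.6$ the overshoot is about 9.5%, so the 10% specification holds; for this family of systems the exact peak overshoot is $\exp(-\pi\zeta/\sqrt{1-\zeta^2})$, which gives a check on the grid approximation.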

The family of specifications given by a functional inequality (3.1) is linearly
ordered: $\mathcal{D}_{\phi}^{(a)}$ is a stronger specification than $\mathcal{D}_{\phi}^{(b)}$ if $a \leq b$. A functional can also
allow us to make quantitative comparisons of the specifications $\mathcal{D}_{\phi}^{(a)}$ and $\mathcal{D}_{\phi}^{(b)}$. For
example, $\mathcal{D}_{\mathrm{act\_eff}}^{(0.05)}$ is not merely a tighter specification than $\mathcal{D}_{\mathrm{act\_eff}}^{(0.1)}$; we can say that
it is twice as tight, at least as measured by the functional $\phi_{\mathrm{act\_eff}}$.

In the next two sections we describe two common methods for capturing the 
notion of a "suitable" design using families of feasibility problems. 

3.5 Multicriterion Optimization 

In multicriterion optimization, we have a hard constraint $\mathcal{D}_{\mathrm{hard}}$ and objective
functionals or criteria $\phi_1, \ldots, \phi_L$. Each objective represents a soft goal in the design:
if the closed-loop transfer matrix $H$ satisfies $\mathcal{D}_{\mathrm{hard}}$, it is generally desirable to have
each $\phi_i(H)$ small, although we have not (yet) set any priorities among the objective
functionals.

We consider the family of design specifications, indexed by the parameters
$a_1, \ldots, a_L$, given by

$$ \mathcal{D}^{(a_1, \ldots, a_L)} : \quad \mathcal{D}_{\mathrm{hard}} \wedge \mathcal{D}_{\phi_1}^{(a_1)} \wedge \cdots \wedge \mathcal{D}_{\phi_L}^{(a_L)}. \qquad (3.2) $$

$\mathcal{D}_{\mathrm{hard}}$ is the conjunction of our hard constraints, that is, the performance specifications
about which we are inflexible. Every specification we consider, $\mathcal{D}^{(a_1, \ldots, a_L)}$, is stronger than
$\mathcal{D}_{\mathrm{hard}}$. The remaining specifications are functional inequalities for the criteria. The
basic goal of multicriterion optimization is to identify specifications $\mathcal{D}^{(a_1, \ldots, a_L)}$ that
are on the boundary between achievable specifications and unachievable specifications:

Definition 3.5: The specification $\mathcal{D}^{(a_1, \ldots, a_L)}$ is Pareto optimal or noninferior if
the specification $\mathcal{D}^{(\tilde a_1, \ldots, \tilde a_L)}$ is achievable whenever $\tilde a_1 > a_1, \ldots, \tilde a_L > a_L$, and is
unachievable whenever $\tilde a_1 < a_1, \ldots, \tilde a_L < a_L$. A closed-loop transfer matrix that
satisfies a Pareto optimal specification is called a Pareto optimal transfer matrix or
design.

This is illustrated in figure 3.3 for our particular example. The points on the
boundary between the achievable and unachievable specifications, shown with a
dashed line in figure 3.3, are precisely the Pareto optimal specifications.

Figure 3.3 A specification is Pareto optimal if smaller limits in the functional
inequalities yield unachievable specifications and larger limits in the
functional inequalities yield achievable specifications. Such specifications
are on the boundary between achievable and unachievable specifications.

Figure 3.4 The region of achievable specifications that are tighter than the
specification marked $\mathcal{D}_Y$ is shaded, and similarly for $\mathcal{D}_W$. The specifications
$\mathcal{D}_A$ and $\mathcal{D}_B$ are Pareto optimal.

Let us illustrate this concept further. Figure 3.2 is redrawn in figure 3.4 with the
specifications $\mathcal{D}_Y$ and $\mathcal{D}_W$ shown, along with two new specifications, $\mathcal{D}_A$ and $\mathcal{D}_B$.
The region of achievable specifications that are tighter than the specification $\mathcal{D}_Y$ is
shaded, and similarly for $\mathcal{D}_W$. To see the significance of these regions, consider a
closed-loop transfer matrix $\tilde H$ with $(\phi_{\mathrm{os}}(\tilde H), \phi_{\mathrm{act\_eff}}(\tilde H))$ lying in the shaded region
corresponding to $\mathcal{D}_Y$, and a closed-loop transfer matrix $H$ with $\phi_{\mathrm{os}}(H) = 6\%$,
$\phi_{\mathrm{act\_eff}}(H) = 0.1$ (i.e., $H$ just satisfies $\mathcal{D}_Y$). $\tilde H$ is clearly a better design, in the
sense that it has less overshoot and a lower actuator effort than $H$. $\mathcal{D}_Y$ is not
Pareto optimal precisely because there are designs that are better in both overshoot
and actuator effort. In general, an achievable but not Pareto optimal specification
is one that can be strictly strengthened. The specifications $\mathcal{D}_A$ and $\mathcal{D}_B$ are Pareto
optimal.
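When only a finite list of candidate designs is available, the idea of "a design that is better in every criterion" can be sketched as a simple domination filter. This is an approximation of Definition 3.5, which concerns the boundary of the whole achievable region; the candidate values below are invented for illustration, and the helper names are our own.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (overshoot in percent, RMS actuator effort)

def dominates(p: Point, q: Point) -> bool:
    """p dominates q if p is no worse in every criterion and strictly
    better in at least one (smaller is better)."""
    return all(x <= y for x, y in zip(p, q)) and any(x < y for x, y in zip(p, q))

def pareto_front(points: List[Point]) -> List[Point]:
    """Keep the points that no other point in the list dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

# Hypothetical performance of five candidate designs.
designs = [(6.0, 0.10), (5.0, 0.095), (10.0, 0.05), (12.0, 0.05), (8.0, 0.07)]
print(pareto_front(designs))
```

Here the design at (6%, 0.1), which just meets the specification with those limits, is discarded because another candidate is better in both overshoot and actuator effort.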

Yet another interpretation of the Pareto optimal specifications is in terms of the 
partial ordering of the design specifications. Figure 3.5 shows the partial ordering 
of the design specifications from figures 3.2 and 3.4, together with the boundary 
between achievable and unachievable specifications. For each linearly ordered chain 
of specifications that contains a Pareto optimal specification, all the weaker spec- 
ifications will be achievable and all the tighter specifications will be unachievable. 
In terms of the partial ordering among specifications by strength, Pareto optimal 
specifications are the minimal achievable specifications. 

The boundary between achievable and unachievable specifications is called the
tradeoff curve (between the objectives $\phi_{\mathrm{os}}$ and $\phi_{\mathrm{act\_eff}}$). We say that the specification
$\mathcal{D}_A$ trades off lower overshoot against higher actuator effort, as compared to
$\mathcal{D}_B$. More generally, with $L$ specifications, we have a tradeoff surface (between the
functionals $\phi_1, \ldots, \phi_L$) in performance space.

Figure 3.5 A directed graph showing the ordering (in the sense of strength)
between the specifications in figures 3.2 and 3.4. The boundary between
achievable and unachievable specifications is shown with a dashed line.

The idea of a tradeoff among various competing objectives is very important. 
Figure 3.6 shows two different possible regions of achievable specifications. In plot 
(a), the tradeoff curve is nearly straight, meaning that the overshoot and actuator 
effort are tightly coupled, and that we must "give up on one to improve the other". 
In plot (b), the tradeoff curve is quite bent, which means that we can do very well 
in terms of actuator effort and overshoot simultaneously: we give up only a little 
bit in each objective to do well in both. In this case we might say that the two 
functionals are nearly independent. 

We note that Pareto optimal specifications themselves may be either achievable 
or unachievable. If a Pareto optimal specification is unachievable, however, there 
are arbitrarily close specifications that are achievable, so, in practice, it is irrelevant 
whether or not the Pareto optimal specifications can be achieved. 

3.6 Optimal Controller Paradigm 

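In the classical optimization paradigm a family of design specifications indexed by
a single parameter $\alpha$ is considered:

$$ \mathcal{D}^{(\alpha)} : \quad \mathcal{D}_{\mathrm{hard}} \wedge \mathcal{D}_{\mathrm{obj}}^{(\alpha)}, \qquad (3.3) $$

where $\mathcal{D}_{\mathrm{obj}}^{(\alpha)}$ is the functional inequality specification

$$ \mathcal{D}_{\mathrm{obj}}^{(\alpha)} : \quad \phi_{\mathrm{obj}}(H) \leq \alpha. $$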

As in multicriterion optimization, $\mathcal{D}_{\mathrm{hard}}$ is the conjunction of our hard constraints.
Unlike general multicriterion optimization, the family of specifications we consider,
$\mathcal{D}^{(\alpha)}$, is linearly ordered. Any two specifications of this form can be compared; $\mathcal{D}^{(\alpha)}$
is tighter than $\mathcal{D}^{(\beta)}$ if $\alpha \leq \beta$.

Figure 3.6 Two possible tradeoff curves. In (a), the designer must trade
off any reduction in actuator effort with an increase in the step response
overshoot, and vice versa. In (b), it is possible to have low overshoot and
actuator effort; the specification at the knee of the tradeoff curve is not
much worse than minimizing each separately.

We simply seek the tightest of these specifications that is achievable; more precisely,
we seek the critical value $\alpha_{\mathrm{crit}}$ such that $\mathcal{D}^{(\alpha)}$ is achievable for $\alpha > \alpha_{\mathrm{crit}}$ and
unachievable for $\alpha < \alpha_{\mathrm{crit}}$. Of course, this is the unique Pareto optimal specification
if we consider this a multicriterion optimization problem with one criterion. A
transfer matrix $H_{\mathrm{opt}}$ that minimizes $\phi_{\mathrm{obj}}(H)$, that is, satisfies $\mathcal{D}^{(\alpha_{\mathrm{crit}})}$, is called a
$\phi_{\mathrm{obj}}$-optimal design.

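The classical optimization paradigm is more commonly expressed as

$$ \alpha_{\mathrm{crit}} = \min_{H \text{ satisfies } \mathcal{D}_{\mathrm{hard}}} \phi_{\mathrm{obj}}(H), \qquad (3.4) $$

$$ H_{\mathrm{opt}} = \mathop{\mathrm{argmin}}_{H \text{ satisfies } \mathcal{D}_{\mathrm{hard}}} \phi_{\mathrm{obj}}(H), \qquad (3.5) $$

where the notation argmin denotes a minimizer of $\phi_{\mathrm{obj}}$, i.e., $\alpha_{\mathrm{crit}} = \phi_{\mathrm{obj}}(H_{\mathrm{opt}})$.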
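Because the family $\mathcal{D}^{(\alpha)}$ is linearly ordered, the critical value can in principle be located by solving a sequence of feasibility problems and bisecting on $\alpha$. The Python sketch below assumes a feasibility oracle is available; `hypothetical_oracle`, with its threshold of 1.91, is a made-up stand-in and not a real solver.

```python
from typing import Callable

def critical_value(feasible: Callable[[float], bool], lo: float, hi: float,
                   tol: float = 1e-6) -> float:
    """Bisection for alpha_crit. Assumes the specification is unachievable at
    lo, achievable at hi, and that achievability is monotone in alpha (which
    holds because the family is linearly ordered)."""
    if feasible(lo) or not feasible(hi):
        raise ValueError("need an unachievable lower limit and an achievable upper limit")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if feasible(mid):
            hi = mid      # mid is achievable: the critical value is at or below mid
        else:
            lo = mid      # mid is unachievable: the critical value is above mid
    return hi

def hypothetical_oracle(alpha: float) -> bool:
    """Stand-in for solving the feasibility problem with limit alpha."""
    return alpha >= 1.91

alpha_crit = critical_value(hypothetical_oracle, 0.0, 10.0)
print(alpha_crit)
```

Each bisection step halves the interval known to contain the critical value, so the number of feasibility problems grows only logarithmically with the required accuracy.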

While the classical optimization paradigm is formally a special case of multicriterion
optimization, it is used quite differently. The objective functional $\phi_{\mathrm{obj}}$ is
usually not just one of many single soft goals, as is each objective functional in
multicriterion optimization; rather it generally represents some sort of combination of
the many important functionals into one single functional. In the following sections
we survey some of the common methods used to combine functionals into the single
objective functional $\phi_{\mathrm{obj}}$.

3.6.1 Weighted-Sum Objective

One common method is to add the various functionals, after they have been multiplied
by weights. We form

$$ \phi_{\mathrm{obj}}(H) = \lambda_1 \phi_1(H) + \cdots + \lambda_L \phi_L(H), \qquad (3.6) $$

where $\lambda_i$ are nonnegative numbers, called weights, which assign relative "values"
among the functionals $\phi_i$. We refer to an objective functional of the form (3.6)
as a weighted-sum objective; the vector $\lambda$ with components $\lambda_i$ is called the weight
vector. One method for choosing the weights is to scale each functional by a typical
or nominal value:

$$ \lambda_i = \frac{1}{\phi_i^{\mathrm{nom}}}, $$

where $\phi_i^{\mathrm{nom}}$ represents some nominal value of the functional $\phi_i$. We can think of
these nominal values as including the (possibly different) physical units of each
functional, so that each term in the sum (3.6) is dimensionless (and hence, they can
be sensibly added).

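For our example the designer might choose

$$ \phi_{\mathrm{os}}^{\mathrm{nom}} = 10\% = 0.1, \qquad \phi_{\mathrm{act\_eff}}^{\mathrm{nom}} = 0.05, $$

so that

$$ \lambda_1 = \frac{1}{0.1} = 10, \qquad \lambda_2 = \frac{1}{0.05} = 20, \qquad (3.7) $$

and the objective functional is

$$ \phi_{\mathrm{obj}}(H) = 10 \, \phi_{\mathrm{os}}(H) + 20 \, \phi_{\mathrm{act\_eff}}(H). \qquad (3.8) $$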

In figure 3.7(a), the lines of constant objective (3.8) in the performance plane
are shown. Along each of these constant objective lines, the tradeoff is exactly
increasing (or decreasing) $\phi_{\mathrm{os}}$ by 10%, while decreasing (or increasing) $\phi_{\mathrm{act\_eff}}$ by
0.05. The constant objective line $\phi_{\mathrm{obj}} = 1.91$ is tangent to the tradeoff curve; the
$\phi_{\mathrm{obj}}$-optimal specification is the intersection of this line with the tradeoff curve.
This specification is $a_1 = 10.78\%$, $a_2 = 0.0415$.

In figure 3.7(b), the constant objective lines and the optimum specification are
shown for the weights $\lambda_1 = 20$, $\lambda_2 = 10$. These weights put more emphasis on
the step response overshoot than do our original weights (3.7); the corresponding
optimum specification is $a_1 = 4.86\%$, $a_2 = 0.097$, which represents a tradeoff of
smaller overshoot for larger actuator effort over the optimum specification for the
weights (3.7).
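The effect of the two weight vectors can be reproduced qualitatively with a small grid search. The sketch below assumes a made-up tradeoff curve $a_1 a_2 = 0.46$ (with $a_1$ in percent); it is a stand-in for the true tradeoff curve of the example, so the numbers it prints will differ from those quoted above, but the direction of the change is the same.

```python
def tradeoff_a2(a1_percent: float) -> float:
    """Hypothetical tradeoff curve: smallest achievable actuator effort for a
    given overshoot limit. Invented for illustration."""
    return 0.46 / a1_percent

def weighted_sum_optimum(lam1: float, lam2: float):
    """Grid search along the hypothetical tradeoff curve for the point that
    minimizes lam1 * phi_os + lam2 * phi_act_eff, where phi_os = a1 / 100."""
    best = None
    for k in range(100, 3001):            # a1 from 1.00% to 30.00%
        a1 = k / 100.0
        a2 = tradeoff_a2(a1)
        value = lam1 * a1 / 100.0 + lam2 * a2
        if best is None or value < best[0]:
            best = (value, a1, a2)
    return best                            # (objective value, a1 in percent, a2)

for weights in [(10.0, 20.0), (20.0, 10.0)]:
    value, a1, a2 = weighted_sum_optimum(*weights)
    print(f"weights {weights}: objective {value:.3f}, a1 = {a1:.2f}%, a2 = {a2:.4f}")
```

Putting more weight on overshoot moves the optimum along the curve towards smaller overshoot and larger actuator effort.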

In the general case, we consider hyperplanes in performance space with a normal
vector $\lambda$. Each such hyperplane consists of specifications that yield the same value
of the objective functional $\phi_{\mathrm{obj}}$ given by (3.6). We can interpret the optimization
problem (3.4) geometrically in performance space as follows: "go as far as possible
in the direction $-\lambda$ while staying in the region of achievable specifications". More
precisely,

$$ \alpha_{\mathrm{crit}} = \min \left\{ \lambda^T a \;\middle|\; \mathcal{D}^{(a)} \text{ is achievable} \right\}. $$

Figure 3.7 Lines of constant objective functional (3.6) are shown, together
with the optimizing design. In (a) the weights are $\lambda_1 = 10$ and $\lambda_2 = 20$ and
the smallest achievable objective functional value is 1.91. In (b) the weights
are $\lambda_1 = 20$ and $\lambda_2 = 10$ and the smallest achievable objective functional
value is 1.94. In both cases the smaller objective functional values shown
lie below the performance boundary and are therefore not achievable.

Figure 3.7 suggests another interpretation of the classical optimization problem
with objective (3.6). If the tradeoff boundary is smooth near the optimal specification
$a^{\mathrm{opt}}$, then a first order approximation of the tradeoff surface is given by the
tangent hyperplane

$$ \left\{ a \;\middle|\; \lambda^T (a - a^{\mathrm{opt}}) = 0 \right\}. \qquad (3.9) $$

We can use (3.9) to estimate nearby specifications on the tradeoff surface. For
example, the approximate optimal tradeoff between two functionals $\phi_i$ and $\phi_j$ (with
all other specifications fixed) is given by

$$ \lambda_i \, \delta\phi_i \approx -\lambda_j \, \delta\phi_j, $$

where $\delta\phi_i$ represents the change in the functional $\phi_i$ along the tradeoff surface. Since
the weights are positive, $\delta\phi_i$ and $\delta\phi_j$ have different signs, meaning that we must
give up performance in one functional (e.g., $\delta\phi_i > 0$) to obtain better performance
in the other ($\delta\phi_j < 0$). For small changes, the ratio of the two changes is nearly
the inverse ratio of the weights:

$$ \frac{\delta\phi_i}{\delta\phi_j} \approx -\frac{\lambda_j}{\lambda_i}. $$

Thus, when the classical optimization paradigm is applied to the weighted-sum
objective (3.6), the resulting specification is on the tradeoff surface, and the local
tradeoffs between the different functionals are given by the inverse ratios of the
corresponding weights.
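As a concrete instance of this local tradeoff, consider the weights $\lambda_1 = 10$, $\lambda_2 = 20$ of our example. The step of one percentage point used below is our own choice for illustration, and the estimate is only a first order one, valid near the optimum and assuming the tradeoff curve is smooth there:

$$ \lambda_1 \, \delta\phi_{\mathrm{os}} + \lambda_2 \, \delta\phi_{\mathrm{act\_eff}} \approx 0 \quad \Longrightarrow \quad \delta\phi_{\mathrm{act\_eff}} \approx -\frac{10}{20} \, \delta\phi_{\mathrm{os}} = -0.5 \, \delta\phi_{\mathrm{os}}. $$

Thus accepting one more percentage point of overshoot ($\delta\phi_{\mathrm{os}} = 0.01$) buys a reduction of roughly $0.005$ in the RMS actuator effort. This is the same exchange rate as "10% of overshoot for 0.05 of actuator effort" along the constant objective lines, scaled down by a factor of ten.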

The solution of each classical optimization problem with the weighted-sum objective
(3.6) is a Pareto optimal specification for the multicriterion optimization
problem with criteria $\phi_1, \ldots, \phi_L$. However, there can be Pareto optimal specifications
that are not optimal for any selection of weights in the classical optimization
problem with weighted-sum objective. Figure 3.8 shows an example of this. We will
see in chapter 6 that in many cases the Pareto optimal specifications are exactly
the same as the specifications that are optimal for some selection of weights in the
classical optimization problem with weighted-sum objective.

Figure 3.8 An example of a tradeoff curve with a Pareto optimal specification
that is not optimal for any selection of weights in the classical
optimization problem with weighted-sum objective. It is optimal for the
weighted-max objective with weights $\lambda_1 = \lambda_2 = 10$.



3.6.2 The Dual Function 

An important function defined in terms of the weighted-sum objective is the dual
objective or dual function associated with the functionals $\phi_1, \ldots, \phi_L$ and the hard
constraint $\mathcal{D}_{\mathrm{hard}}$. It is defined as the mapping from the nonnegative weight vector
$\lambda \in \mathbf{R}_+^L$ into the resulting minimum weighted-sum objective:

$$ \psi(\lambda) = \min \left\{ \lambda_1 \phi_1(H) + \cdots + \lambda_L \phi_L(H) \;\middle|\; H \text{ satisfies } \mathcal{D}_{\mathrm{hard}} \right\}. \qquad (3.10) $$
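A minimal sketch of the dual function, assuming the minimization in (3.10) is carried out over a small finite list of candidate designs rather than over all transfer matrices that satisfy the hard constraint. The performance values in `designs` are invented for illustration.

```python
from typing import Sequence, Tuple

# Hypothetical (phi_1(H), phi_2(H)) values for three designs that are assumed
# to satisfy the hard constraint.
designs = [(0.05, 0.097), (0.10, 0.046), (0.20, 0.023)]

def psi(lam: Sequence[float]) -> float:
    """Dual function restricted to the finite list: the smallest weighted-sum
    objective over the candidate designs."""
    return min(lam[0] * p1 + lam[1] * p2 for p1, p2 in designs)

lam: Tuple[float, float] = (10.0, 20.0)
mu: Tuple[float, float] = (20.0, 10.0)
mid = tuple(0.5 * (x + y) for x, y in zip(lam, mu))
print(psi(lam), psi(mu), psi(mid))
```

Since `psi` is a pointwise minimum of functions that are linear in the weights, it is concave and positively homogeneous in the weight vector, whatever the list of designs; the three printed values are consistent with this, the value at the midpoint being at least the average of the other two.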
We will encounter the dual function in chapter 6 and chapters 12-15.



3.6.3 Weighted-Max Objective 

Another approach, sometimes called minimax design, is to form the objective
functional as the maximum of the weighted functionals:

$$ \phi_{\mathrm{obj}}(H) = \max \{ \lambda_1 \phi_1(H), \ldots, \lambda_L \phi_L(H) \}, \qquad (3.11) $$

where $\lambda_i$ are nonnegative. We refer to (3.11) as a weighted-max objective. As before,
the weights are meant to express the designer's preferences among the functionals.
In figure 3.9 the curves of constant weighted-max objective are shown for the
same two sets of weights as figure 3.7. In figure 3.9(a) the weights are $\lambda_1 = 10$,
$\lambda_2 = 20$; the constant objective curve $\phi_{\mathrm{obj}} = 0.96$ touches the tradeoff curve at
$a_1 = 9.6\%$, $a_2 = 0.048$. In figure 3.9(b) the weights are $\lambda_1 = 20$, $\lambda_2 = 10$; the
constant objective curve $\phi_{\mathrm{obj}} = 0.97$ touches the tradeoff curve at $a_1 = 4.85\%$,
$a_2 = 0.097$.







Figure 3.9 Lines of constant minimax objective functional (3.11) are
shown, together with the optimizing design. In (a) the weights are $\lambda_1 = 10$
and $\lambda_2 = 20$ and the smallest achievable objective functional value is 0.96.
In (b) the weights are $\lambda_1 = 20$ and $\lambda_2 = 10$ and the smallest achievable
objective functional value is 0.97. In both cases the smaller objective functional
values shown lie below the performance boundary and are therefore
not achievable. (Two panels, (a) and (b); horizontal axis: $a_1$ in percent.)

This minimax approach always produces Pareto optimal specifications, and vice
versa: every Pareto optimal specification arises as the solution of the minimax
problem (3.11) for some choice of weights. For example, the Pareto optimal specification
shown in figure 3.8 is optimal for the minimax specification with weights
$\lambda_1 = \lambda_2 = 10$.
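The difference between the two objectives on a nonconvex tradeoff can be seen in a three-design toy problem. The performance values below are invented so that design C lies in a "dent" of the tradeoff curve; they are not taken from the example of this chapter.

```python
# Hypothetical (phi_1, phi_2) values; all three designs are Pareto optimal
# within this list, since none is at least as good as another in both criteria.
designs = {"A": (0.0, 1.0), "B": (1.0, 0.0), "C": (0.6, 0.6)}

def weighted_sum(lam, phi):
    return lam[0] * phi[0] + lam[1] * phi[1]

def weighted_max(lam, phi):
    return max(lam[0] * phi[0], lam[1] * phi[1])

def best(objective, lam):
    """Name of the design that minimizes the given objective."""
    return min(designs, key=lambda name: objective(lam, designs[name]))

# Sweep normalized weight vectors (k/100, 1 - k/100) for the weighted sum.
sum_winners = {best(weighted_sum, (k / 100.0, 1.0 - k / 100.0)) for k in range(101)}
max_winner = best(weighted_max, (1.0, 1.0))
print(sorted(sum_winners), max_winner)
```

No weight vector in the sweep selects C with the weighted-sum objective, because its value $0.6(\lambda_1 + \lambda_2)$ always exceeds $\min\{\lambda_1, \lambda_2\}$; the weighted-max objective with equal weights selects it directly.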
In one variation on the weighted-max objective (3.11), a constant offset is subtracted
from each functional:

$$ \phi_{\mathrm{obj}}(H) = \max \{ \lambda_1 (\phi_1(H) - \gamma_1), \ldots, \lambda_L (\phi_L(H) - \gamma_L) \}. \qquad (3.12) $$

3.7 General Design Procedures 

In this section we describe some design procedures that have been suggested and
successfully used. We suppose that $\phi_1, \ldots, \phi_L$ are our criteria. From our discussion
of multicriterion optimization, we know that, among the achievable specifications,
the Pareto optimal ones are wise choices. It remains to choose one of these, i.e., to
"search" the tradeoff surface for a suitable design. This is generally done interactively,
often by repeatedly adjusting the weights in a weighted-sum or weighted-max
objective and evaluating the resulting optimal design.

3.7.1 Initial Designs 

We have already mentioned one sensible set of initial weights for either the weighted-sum
or weighted-max objectives: each $\lambda_i$ is the inverse of a nominal value or a
maximum value for the functional $\phi_i$.

Several researchers have suggested the following method for determining reasonable
initial weights and offsets for the offset weighted-max objective (3.12). The
designer first decides what "good" and "bad" values would be for each objective; we
will use $G_i$ and $B_i$ to denote these values. One choice is to let $G_i$ be the minimum
value of the functional $\phi_i$ alone. We then form the objective

$$ \phi_{\mathrm{obj}}(H) = \max \left\{ \frac{\phi_1(H) - G_1}{B_1 - G_1}, \ldots, \frac{\phi_L(H) - G_L}{B_L - G_L} \right\}, \qquad (3.13) $$

which is the maximum of the "normalized badness" of $H$: $\phi_{\mathrm{obj}}(H) \approx 1$ means that
at least one criterion is near a bad value; $\phi_{\mathrm{obj}}(H) \approx 0.1$ means that the worst
criterion is only 10% of the way away from its good value, towards its bad value.
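The objective (3.13) is straightforward to evaluate once the good and bad values are chosen; it is the offset weighted-max objective (3.12) with $\lambda_i = 1/(B_i - G_i)$ and $\gamma_i = G_i$. A minimal sketch, with good and bad values invented for illustration:

```python
from typing import Sequence

def normalized_badness(phi: Sequence[float], good: Sequence[float],
                       bad: Sequence[float]) -> float:
    """Objective (3.13): the largest fraction of the way from the good value
    to the bad value, taken over all criteria."""
    return max((p - g) / (b - g) for p, g, b in zip(phi, good, bad))

# Hypothetical good and bad values for (overshoot, RMS actuator effort).
good = (0.02, 0.03)
bad = (0.10, 0.10)

# A design with 6% overshoot and actuator effort 0.044: overshoot is halfway
# from good to bad, actuator effort only a fifth of the way, so the worst
# criterion (overshoot) determines the objective.
print(normalized_badness((0.06, 0.044), good, bad))
```

A value of 0 corresponds to a design that attains every good value simultaneously, and a value of 1 to a design whose worst criterion sits exactly at its bad value.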

Another approach, called goal programming, starts with some "goal" values $G_i$
for the functionals $\phi_1, \ldots, \phi_L$. The designer then determines the closest achievable
specification to this goal specification, using, e.g., a weighted Euclidean distance in
performance space.
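A sketch of the goal programming step, assuming the achievable specifications are represented by a finite sample of points on a made-up tradeoff curve $a_1 a_2 = 0.46$ ($a_1$ in percent). The goal, the weights (inverse squares of the nominal values 10% and 0.05), and the helper names are our own choices for illustration.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]

def weighted_distance(a: Point, goal: Point, weights: Sequence[float]) -> float:
    """Weighted Euclidean distance in performance space."""
    return math.sqrt(sum(w * (x - g) ** 2 for w, x, g in zip(weights, a, goal)))

def closest_achievable(goal: Point, weights: Sequence[float],
                       candidates: Sequence[Point]) -> Point:
    """Achievable specification (from the sampled candidates) nearest the goal."""
    return min(candidates, key=lambda a: weighted_distance(a, goal, weights))

# Sampled points on the hypothetical tradeoff curve, a1 from 2.0% to 20.0%.
candidates = [(k / 10.0, 0.46 / (k / 10.0)) for k in range(20, 201)]
weights = (1.0 / 10.0 ** 2, 1.0 / 0.05 ** 2)
goal = (4.0, 0.04)   # an unachievable goal under the hypothetical curve

print(closest_achievable(goal, weights, candidates))
```

Changing the weights, or replacing the Euclidean distance by another norm, selects a different point of the tradeoff curve for the same goal.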

3.7.2 Design Iterations 

Perhaps the most common method used to find a new and, one hopes, more suitable
design is informal adjustment ("tweaking") of the weights in the weighted-sum or
weighted-max objective for the classical optimization problem. The designer picks
one criterion, say $\phi_{i_{\mathrm{bad}}}$, whose current optimal value is deemed unacceptably large,
and then increases the associated weight $\lambda_{i_{\mathrm{bad}}}$. One drawback of this method is
that if more than one weight is adjusted in a single iteration, it can be difficult to
predict the resulting effect on the optimal values of the criteria.
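A few iterations of such weight adjustment can be simulated as follows. The sketch assumes a made-up tradeoff curve $a_1 a_2 = 0.46$ ($a_1$ in percent) in place of a real controller design problem, and a fixed rule (multiply the overshoot weight by 1.5) in place of the designer's judgment.

```python
def optimum(lam1: float, lam2: float):
    """Weighted-sum optimum on the hypothetical tradeoff curve, found by grid
    search over overshoot limits a1 from 1% to 30%."""
    candidates = [(k / 100.0, 0.46 / (k / 100.0)) for k in range(100, 3001)]
    return min(candidates, key=lambda a: lam1 * a[0] / 100.0 + lam2 * a[1])

lam1, lam2 = 10.0, 20.0
history = []
for _ in range(4):
    a1, a2 = optimum(lam1, lam2)
    history.append((lam1, a1, a2))
    lam1 *= 1.5   # overshoot judged too large: raise its weight and re-solve

for lam, a1, a2 in history:
    print(f"lambda_1 = {lam:6.2f}: overshoot {a1:5.2f}%, actuator effort {a2:.4f}")
```

With a single weight adjusted per iteration the effect is predictable: the optimal overshoot decreases and the actuator effort increases at every step.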

Using the objective (3.13), a design iteration consists of a re-evaluation of what 
good and bad values are for each objective, based on the current and previous 
optimal values of the criteria. Similarly, in goal programming the designer can 
change the goal specification, or change the norm used to determine the closest 
achievable specification. 

A procedure called the satisficing tradeoff method uses the objective (3.13). In
this method, the $B_i$ are called aspiration levels, so the goal is to find a design
with $\phi_{\mathrm{obj}}(H) \leq 1$, if possible, which means that our aspirations have been met
or exceeded. At each iteration, new aspiration levels are set according to various
heuristic rules, e.g., they should be lower for those functionals that are currently
too large, and they must be larger for at least one functional that the designer feels
can be worsened a bit.

Notes and References

The Feasibility Problem 

In [ZA73], Zakian and Al-Naib formulate the design problem as a set of functional in- 
equalities that must be satisfied: 

In the design of dynamical systems, such as control systems, electrical net- 
works and other analogous systems, it is convenient to formulate the design 
problem in terms of the inequalities

$$\phi_i(p) \le C_i, \qquad i = 1, 2, \ldots, m$$

[$p$ represents the design]. Some of the $\phi_i(p)$ are related to the dynamical
behavior of the system . . . and are called functionals.

If the $C_i$ are fixed, then Zakian and Al-Naib's formulation is what we have called the
feasibility problem; if, on the other hand, the $C_i$ are treated as parameters that the designer
can vary, then the "wise" choices for the parameters are the Pareto optimal values.

The Classical Optimal Controller Paradigm 

Wiener and Kolmogorov were the first to explicitly study the optimal controller paradigm. 
Linear controller design before that time had consisted mostly of rules for synthesizing 
controllers, e.g., PID tuning rules [ZN42] or root locus methods [Eva50], or analysis 
useful in designing controllers, e.g., Bode [Bod45]. In the 1957 book by Newton, Gould, 
and Kaiser [NGK57], the optimal controller paradigm is called the "analytical design 
procedure", which they describe as (p. 31): 

In place of a relatively simple statement of the allowable error, the analytical 
design procedure employs a more or less elaborate performance index. The 
objective of the performance index is to encompass in a single number a quality 
measure for the performance of the system. 

Multicriterion Optimization and Pareto Optimality 

The notion of Pareto optimality is first explicitly described in Pareto's 1896 treatise on 
Economics [Par96]; since then it has been extensively studied in the Econometrics
literature, e.g., Von Neumann and Morgenstern [NM53] and Debreu [Deb59]. The latter book
contains many plots exactly like our plots of achievable specifications, tradeoff curves, 
and so on. Two recent texts on multicriterion optimization are Sawaragi, Nakayama, and 
Tanino [SNT85] and Luc [Luc89]. [SNT85] covers many practical aspects of multicri- 
terion optimization and decision making; [Luc89] is a clear and complete description of 
vector optimization, covers topics such as convexity, quasiconvexity, and duality, and has 
an extensive bibliography. Brayton and Spence [BS80, Ch7] has a general discussion of 
multicriterion optimization similar to ours. 

The article [Wie82] describes satisficing decision making, gives a mathematically rigorous 
formulation, and has a large set of references. The procedure for selecting weights and 
offsets in (3.13) of section 3.7 is described in Nye and Tits [NT86], Fan et al. [FWK89], 
and also the book by Sawaragi, Nakayama, and Tanino [SNT85]. 

An early reference in the control literature to multicriterion optimization is the arti- 
cle [Zad63] by Zadeh, published right at the rise in prominence of the optimal controller 
paradigm, which was spurred by the work of Kalman and others on the linear quadratic
regulator (see chapter 12 and [AM90]). The recent book by Ng [Ng89] has a general
discussion of multicriterion optimization for control system design, concentrating on the
"quality of cooperation between the designer and his computer".

The Example Figures 

The plots of achievable specifications in performance space for our standard example are 
fictitious (in fact, we never specified the power spectral densities $S_{\rm proc}$ and $S_{\rm sensor}$, so our
objective functionals are not even fully defined). Later in this book we will see many real 
tradeoff curves that relate to our standard example; see for example chapters 12 and 15. 

Chapter 4

Norms of Signals 



Many of the goals of controller design can be expressed in terms of the size of 
various signals, e.g., tracking error signals should be made "small", while the 
actuator signals should not be "too large". In this chapter we explore some of the
ways this notion of the size of a signal can be made precise, using norms, which 
generalize the notion of Euclidean length. 



4.1 Definition 

There are many ways to describe the size of a signal or to express the idea that a 
signal is small or large. For example, the fraction of time that the magnitude of a 
signal exceeds some given threshold can serve as a measure of the size of the signal; 
we could define "small" to mean that the threshold is exceeded less than 1% of the 
time. Among the many methods to measure the size of a signal, those that satisfy 
certain geometric properties have proven especially useful. These measures of size 
are called norms. 

The geometric properties that norms satisfy are expressed in the framework of 
a vector space; roughly speaking, we have a notion of how to add two signals, and 
how to multiply or scale a signal by a scalar (see the Notes and References). 

Definition 4.1: Suppose $V$ is a vector space and $\phi : V \to \mathbf{R}_+ \cup \{\infty\}$. $\phi$ is a norm
on $V$ if it satisfies

Nonnegativity: $\phi(v) \ge 0$,

Homogeneity: for $\phi(v) < \infty$, $\phi(\alpha v) = |\alpha| \phi(v)$,

Triangle inequality: $\phi(v + w) \le \phi(v) + \phi(w)$,

for all $\alpha \in \mathbf{R}$ and $v, w \in V$.

We warn the reader that this definition differs slightly from the standard definition
of a norm; see the Notes and References at the end of this chapter.
We will generally use the notation

$$\|v\|_x \triangleq \phi(v),$$

where $x$ is some distinguishing mark or mnemonic for $\phi$. This notation emphasizes
that a norm is a generalization of the absolute value for real or complex numbers,
and the Euclidean length of a vector. Note that we allow norms to take on the
value $+\infty$, just as we allow our functionals on transfer matrices to do. We interpret
$\|v\| = \infty$ as "$v$ is infinitely large" as measured by the norm $\|\cdot\|$.

In the next few sections we survey some common norms used to measure the 
size of signals. The verification that these norms do in fact satisfy the required 
properties is left as an exercise for the reader (alternatively, the reader can consult 
the references). 

4.2 Common Norms of Scalar Signals 

4.2.1 Peak 

One simple but strict interpretation of "the signal u is small" is that it is small at 
all times, or equivalently, its maximum or peak absolute value is small. The peak 
or $L_\infty$ norm of $u$ is defined as

$$\|u\|_\infty \triangleq \sup_{t \ge 0} |u(t)|.$$

An example of a signal $u$ and its peak $\|u\|_\infty$ is shown in figure 4.1.

The peak norm of a signal is useful in specifying a strict limit on the absolute 
value of a signal, e.g., the output current of a power amplifier, or the tracking error 
in a disk drive head positioning system. 

The peak norm of a signal depends entirely on the extreme or large values the 
signal takes on. If the signal occasionally has large values, $\|u\|_\infty$ will be large; a
statistician would say $\|u\|_\infty$ depends on outliers or "rare events" in the signal $u$.
We shall soon see other norms that depend to a lesser extent on occasional large 
signal values. 

It is useful to imagine how various signal norms might be measured. A full wave 
rectifier circuit that measures the peak of a voltage signal is shown in figure 4.2. 

The peak norm can be used to describe a signal about which very little is known,
or about which we are willing to assume very little, other than some bound on its
peak or worst case value.
Such a description is called an unknown-but-bounded model of a signal: we assume
only $\|u\|_\infty \le M$. An example is quantization error, the difference between a signal
and its uniformly quantized value. This error can be modeled as unknown but
bounded by one-half of the quantization interval.
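
The quantization example can be checked numerically. The sketch below is a minimal illustration on a sampled signal, not a definitive implementation: the test signal, the time grid, and the quantization interval `delta` are all assumptions made for the example, and the supremum over $t \ge 0$ is replaced by a maximum over the samples.

```python
import numpy as np

def peak_norm(u):
    """Peak (L-infinity) norm of a sampled scalar signal: max over samples of |u|."""
    return float(np.max(np.abs(u)))

t = np.linspace(0.0, 10.0, 5001)           # assumed time grid
u = np.sin(t) + 0.5 * np.sin(3.1 * t)      # assumed test signal
delta = 0.1                                # assumed quantization interval
u_q = delta * np.round(u / delta)          # uniform round-to-nearest quantizer
err = u - u_q                              # quantization error

print(peak_norm(u), peak_norm(err))
```

Whatever the details of `u`, the error satisfies the unknown-but-bounded model with $M$ equal to `delta / 2`.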

Figure 4.1 A signal $u$ and its peak norm $\|u\|_\infty$.

Figure 4.2 With ideal diodes and an ideal capacitor the voltage on the
capacitor, $V_c$, tends to $\|u\|_\infty$ for $t$ large.

A variation on the peak norm is the eventual peak or steady-state peak:

$$\|u\|_{ss\infty} \triangleq \limsup_{t \to \infty} |u(t)| = \lim_{T \to \infty} \sup_{t \ge T} |u(t)|.$$

The steady-state peak norm measures only persistent large excursions of the signal;
unlike the peak norm, it is unaffected by transients, i.e., the addition of a signal
that decays to zero:

$$\|u + u_{\rm transient}\|_{ss\infty} = \|u\|_{ss\infty} \quad \text{if } \lim_{t \to \infty} u_{\rm transient}(t) = 0.$$
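
The insensitivity of the steady-state peak to transients can be illustrated on a finite record. In the sketch below the limit is approximated, as an assumption of the example, by the supremum of $|u|$ over the tail $t \ge 100$ of a record of length 200; the signals themselves are arbitrary choices.

```python
import numpy as np

def tail_peak(u, t, t_start):
    """Finite-record stand-in for the steady-state peak: max of |u| over t >= t_start."""
    return float(np.max(np.abs(u[t >= t_start])))

t = np.linspace(0.0, 200.0, 200001)
u = np.cos(t)                      # persistent signal, steady-state peak 1
transient = 5.0 * np.exp(-t)       # large initially, decays to zero

peak_total = float(np.max(np.abs(u + transient)))   # ordinary peak, set by the transient
ss_u = tail_peak(u, t, 100.0)
ss_sum = tail_peak(u + transient, t, 100.0)
print(peak_total, ss_u, ss_sum)
```

The ordinary peak of the sum is dominated by the transient, while the two tail peaks agree to within rounding.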



4.2.2 Root-Mean-Square 

A measure of a signal that reflects its eventual, average size is its root-mean-square
(RMS) value, defined by

$$\|u\|_{\rm rms} \triangleq \left( \lim_{T \to \infty} \frac{1}{T} \int_0^T u(t)^2 \, dt \right)^{1/2}, \qquad (4.1)$$

provided the limit exists (see the Notes and References). This is a classical notion
of the size of a signal, widely used in many areas of engineering. An example of a
signal $u$ and its RMS value $\|u\|_{\rm rms}$ is shown in figure 4.3.




Figure 4.3 A signal $u$ and its RMS value $\|u\|_{\rm rms}$ is shown in (a). $\|u\|_{\rm rms}^2$
is the average area under $u^2$, as shown in (b).



In an early RMS ammeter, shown in figure 4.4, the torque on the rotor is proportional
to the square of the current; its large rotational inertia, the torsional spring,
and some damping, make the rotor deflection approximately proportional to the
mean-square current, $\|u\|_{\rm rms}^2$.

Figure 4.4 An early RMS ammeter consists of a stator $L_1$ and a rotor
$L_2$, which is fitted with a needle and restoring spring. The deflection of the
needle will be approximately proportional to $\|u\|_{\rm rms}^2$, the square of the RMS
value of the input current $u$.

Another useful conceptual model for $\|u\|_{\rm rms}$ is in terms of the average power
dissipated in a resistive load driven by a voltage $u$, as in figure 4.5. The instantaneous
power dissipated in the load resistor is simply $u(t)^2/R$; the (long term) temperature
rise of the large thermal mass above ambient temperature is proportional to the
average power dissipation in the load, $\|u\|_{\rm rms}^2/R$. This conceptual model shows that
the RMS measure is useful in specifying signal limits that are due to steady-state
thermal considerations such as maximum power dissipation and temperature rise.
For example, the current through a voice coil actuator might be limited by a
maximum allowable steady-state temperature rise for the voice coil; this specification
could be expressed as an RMS limit on the voice coil current.




Figure 4.5 If $u$ varies much faster than the thermal time constant of the
mass, then the long term temperature rise of the mass is proportional to
the average power in $u$, i.e. $T - T_{\rm amb} \propto \|u\|_{\rm rms}^2$, where $T_{\rm amb}$ is the ambient
temperature, and $T$ is the temperature of the mass.



Even if the RMS norm of a signal is small, the signal may occasionally have
large peaks, provided the peaks are not too frequent and do not contain too much
energy. In this sense, $\|u\|_{\rm rms}$ is less affected than $\|u\|_\infty$ by large but infrequent
values of the signal. We also note that the RMS norm is a steady-state measure of
a signal; the RMS value of a signal is not affected by any transient. In particular,
a signal with small RMS value can be very large for some initial time period.
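
A small numerical sketch makes the contrast between the RMS and peak norms concrete. It assumes a finite sampled record in place of the limit in (4.1), and the signal and the isolated spikes are arbitrary illustrative choices.

```python
import numpy as np

def rms_value(u):
    """Finite-record estimate of the RMS value: sqrt of the sample average of u^2."""
    return float(np.sqrt(np.mean(np.square(u))))

t = np.linspace(0.0, 1000.0, 1_000_001)
u = 0.1 * np.sin(t)                    # small persistent signal
spikes = np.zeros_like(t)
spikes[::100_000] = 10.0               # 11 isolated samples of height 10

rms_u = rms_value(u)                   # close to 0.1 / sqrt(2)
rms_spiky = rms_value(u + spikes)      # only slightly larger
peak_spiky = float(np.max(np.abs(u + spikes)))   # set entirely by the spikes
print(rms_u, rms_spiky, peak_spiky)
```

The rare spikes raise the peak by two orders of magnitude but change the RMS value only slightly.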






4.2.3 Average-Absolute Value

A measure that puts even less emphasis on large values of a signal (indeed, the
minimum emphasis possible to still be a norm) is its average-absolute value, defined
by

$$\|u\|_{\rm aa} \triangleq \lim_{T \to \infty} \frac{1}{T} \int_0^T |u(t)| \, dt, \qquad (4.2)$$

provided the limit exists (see the Notes and References). An example of a signal $u$
and its average-absolute norm $\|u\|_{\rm aa}$ is shown in figure 4.6. $\|u\|_{\rm aa}$ can be measured
with the circuit shown in figure 4.7 (cf. the peak detector circuit in figure 4.2).




Figure 4.6 A signal $u$ and its average-absolute value $\|u\|_{\rm aa}$ is shown in (a).
$\|u\|_{\rm aa}$ is found by finding the average area under $|u|$, as shown in (b).

Figure 4.7 If $u$ varies much faster than the time constant $RC$ then $V_c$ will
be nearly proportional to the average of the peak of the input voltage $u$, so
that $V_c \approx \|u\|_{\rm aa}$. The resistor $r \ll R$ ensures that the output impedance
of the bridge is low at all times.



The average-absolute norm $\|u\|_{\rm aa}$ is useful in measuring average fuel or resource
use, when the fuel or resource consumption is proportional to $|u(t)|$. In contrast, the
RMS norm $\|u\|_{\rm rms}$ is useful in measuring average power, which is often proportional
to $u(t)^2$. Examples of resource usage that might be measured with the
average-absolute norm are rocket fuel use, compressed air use, or power supply demand in
a conventional class B amplifier, as shown in figure 4.8.
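
For a sinusoid of amplitude $A$ the three norms seen so far can be computed in closed form: the peak is $A$, the RMS value is $A/\sqrt{2}$, and the average-absolute value is $2A/\pi$. The sketch below checks these values on a sampled record; the amplitude and the record length are assumptions of the example, and sample averages stand in for the limits in (4.1) and (4.2).

```python
import numpy as np

t = np.linspace(0.0, 2000.0 * np.pi, 2_000_001)   # 1000 periods of a sinusoid
A = 3.0                                           # arbitrary amplitude
u = A * np.sin(t)

aa = float(np.mean(np.abs(u)))           # average-absolute value, approx. 2A/pi
rms = float(np.sqrt(np.mean(u**2)))      # RMS value, approx. A/sqrt(2)
peak = float(np.max(np.abs(u)))          # peak, approx. A
print(aa, rms, peak)
```

The ordering `aa <= rms <= peak` seen here holds for any signal for which the three quantities are defined.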





Figure 4.8 Idealized version of a class B power amplifier, with no bias
circuitry shown. Provided the amplifier does not clip, i.e., $\|u\|_\infty \le V_{cc}$, the
average power supplied by the power supply is proportional to $\|u\|_{\rm aa}$, the
average-absolute norm of $u$; the average power dissipated in the load $R$ is
proportional to $\|u\|_{\rm rms}^2$, the square of the RMS norm of $u$.



4.2.4 Norms of Stochastic Signals 

For a signal modeled as a stationary stochastic process, the measure of its size most
often used is

$$\|u\|_{\rm rms} \triangleq \left( \mathbf{E}\, u(t)^2 \right)^{1/2}. \qquad (4.3)$$

Because the process is stationary, the expression in (4.3) does not depend on $t$. For
stochastic signals that approach stationarity as time goes on, we define

$$\|u\|_{\rm rms} \triangleq \left( \lim_{t \to \infty} \mathbf{E}\, u(t)^2 \right)^{1/2}.$$

If the signal $u$ is ergodic, then its RMS norm can be computed either by (4.3)
or (4.1): with probability one the deterministic and stochastic RMS norms are
equal.
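
The agreement between the time-average and ensemble-average RMS values can be illustrated by simulation. The sketch below uses a discrete-time first-order autoregressive process as an assumed stand-in for an ergodic stationary signal; the coefficient `a`, the record length, the number of ensemble draws, and the random seed are all arbitrary choices for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
a, n_steps, n_draws = 0.9, 200_000, 20_000
var_stationary = 1.0 / (1.0 - a**2)     # stationary variance for unit-variance driving noise

# Time average along one long sample path, in the spirit of (4.1).
w = rng.standard_normal(n_steps)
x = np.empty(n_steps)
x[0] = np.sqrt(var_stationary) * rng.standard_normal()   # start in the stationary distribution
for k in range(1, n_steps):
    x[k] = a * x[k - 1] + w[k]
rms_time = float(np.sqrt(np.mean(x**2)))

# Ensemble average at one fixed time, in the spirit of (4.3): independent draws
# from the stationary distribution of x.
samples = np.sqrt(var_stationary) * rng.standard_normal(n_draws)
rms_ensemble = float(np.sqrt(np.mean(samples**2)))

print(rms_time, rms_ensemble, np.sqrt(var_stationary))
```

Both estimates approach $\sqrt{1/(1-a^2)}$ as the record length and the number of draws grow.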

The RMS norm can be expressed in terms of the autocorrelation of $u$,

$$R_u(\tau) \triangleq \mathbf{E}\, u(t) u(t + \tau), \qquad (4.4)$$

or its power spectral density,

$$S_u(\omega) \triangleq \int_{-\infty}^{\infty} R_u(\tau) e^{-j\omega\tau} \, d\tau, \qquad (4.5)$$

as follows:

$$\|u\|_{\rm rms}^2 = R_u(0) = \frac{1}{2\pi} \int_{-\infty}^{\infty} S_u(\omega) \, d\omega. \qquad (4.6)$$

We can interpret the last integral as follows: the average power in the signal is the
integral of the contribution at each frequency.
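
Formula (4.6) is easy to check for a specific autocorrelation. The sketch below assumes, purely as an example, $R_u(\tau) = e^{-|\tau|}$, whose power spectral density is $S_u(\omega) = 2/(1+\omega^2)$, and integrates numerically over a truncated frequency range; the truncation limit and grid are assumptions of the example.

```python
import numpy as np

def trapezoid(y, x):
    """Composite trapezoidal rule for samples y on the grid x."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

w = np.linspace(-1000.0, 1000.0, 200_001)   # truncated frequency grid, rad/s
S_u = 2.0 / (1.0 + w**2)                    # PSD corresponding to R_u(tau) = exp(-|tau|)

mean_square = trapezoid(S_u, w) / (2.0 * np.pi)   # right-hand side of (4.6)
R_u_0 = 1.0                                       # R_u(0) for this example
print(mean_square, R_u_0)
```

The small remaining gap is the power in the truncated tails $|\omega| > 1000$.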

For stochastic signals, the analogs of the average-absolute or peak norm are less
often encountered than the RMS norm. For $u$ stationary, we define the
average-absolute norm as

$$\|u\|_{\rm aa} \triangleq \mathbf{E}\, |u(t)|,$$

which for ergodic signals agrees (with probability one) with our deterministic
definition of $\|u\|_{\rm aa}$. We interpret $\|u\|_{\rm aa}$ as the expected or mean resource consumption.
The analog of the steady-state peak of $u$ is the essential sup norm,

$$\|u\|_{\rm ess\,sup} \triangleq \inf \{ a \mid \mathrm{Prob}(|u(t)| > a) = 0 \},$$

or equivalently, the smallest number $a$ such that with probability one, $|u(t)| \le a$.
Under some mild technical assumptions about $u$, this agrees with probability one
with the steady-state peak norm of $u$ defined in section 4.2.1.

4.2.5 Amplitude Distributions 

We can think of the steady-state peak norm, RMS norm, and average-absolute 
norm as differing in the relative weighting of large versus small signal values: the 
steady-state peak norm is entirely dependent on the large values of a signal; the 
RMS norm is less dependent on the large values, and the average-absolute norm 
less still. 

This idea can be made precise by considering the notion of the amplitude
distribution $F_u(a)$ of a signal $u$, which is, roughly speaking, the fraction of the time the
signal exceeds the limit $a$, or the probability that the signal exceeds the limit $a$ at
some particular time.

We first consider stationary ergodic stochastic signals. The amplitude
distribution is just the probability distribution of the absolute value of the signal:

$$F_u(a) \triangleq \mathrm{Prob}(|u(t)| > a).$$

Since $u$ is stationary, this expression does not depend on $t$.

We can also express $F_u(a)$ in terms of the fraction of time the absolute value of
the signal exceeds the threshold $a$. Consider the time interval $[0, T]$. Over this time
interval, the signal $u$ will spend some fraction of the total time $T$ with $|u(t)| > a$.
$F_u(a)$ is the limit of this fraction as $T \to \infty$:

$$F_u(a) \triangleq \lim_{T \to \infty} \frac{\mu \{ t \mid 0 \le t \le T, \ |u(t)| > a \}}{T}, \qquad (4.7)$$

where $\mu(\cdot)$ denotes the total length (Lebesgue measure) of a subset of the real line.
These two ideas are depicted in figure 4.9. The amplitude distribution of the signal
in figure 4.9 is shown in figure 4.10.

Figure 4.9 Example of calculating $F_u(1.5)$ for the signal $u$ in figure 4.1.
For $T = 10$, $\mu \{ t \mid 0 \le t \le T, \ |u(t)| > 1.5 \}$ is the length of the shaded
intervals, and this length divided by $T$ approximates $F_u(1.5)$.

This last interpretation of the amplitude distribution in terms of the fraction of
time the signal exceeds any given threshold allows us to extend the notion of
amplitude distribution to some deterministic (non-stochastic) signals. For a deterministic
$u$, we define $F_u(a)$ to be the limit (4.7), provided this limit exists (it need not). All
of the results of this section hold for a suitably restricted set of deterministic signals,
if we use this definition of amplitude distribution. There are many more technical
details in such a treatment of deterministic signals, however, so we continue under
the assumption that $u$ is a stationary ergodic stochastic process.

Clearly $F_u(a) = 0$ for $a > \|u\|_{ss\infty}$, and $F_u(a)$ increases to one as $a$ decreases
to zero. Informally, we think of $|u(t)|$ as spending a large fraction of time where
the slope of $F_u(a)$ is sharp; if $F_u$ decreases approximately linearly, we say $|u(t)|$ is
approximately uniformly distributed in amplitude. Figure 4.11 shows two signals
and their amplitude distribution functions.

We can compute the steady-state peak, RMS, and average-absolute norms of $u$
directly from its amplitude distribution. We have already seen that

$$\|u\|_{ss\infty} = \sup \{ a \mid F_u(a) > 0 \}. \qquad (4.8)$$

The steady-state peak of a signal is therefore the value of $a$ at which the graph of
the amplitude distribution first becomes zero.

Figure 4.10 The amplitude distribution $F_u$ of the signal $u$ shown in figure 4.1,
together with the values $\|u\|_{\rm aa}$, $\|u\|_{\rm rms}$, and $\|u\|_\infty$.

From elementary probability theory we have

$$\|u\|_{\rm aa} = \mathbf{E}\, |u(t)| = \int_0^\infty F_u(a) \, da. \qquad (4.9)$$

Thus, the average-absolute norm of a signal is the total area under the amplitude
distribution function.

Since the amplitude distribution function of $u^2$ is $F_{u^2}(a) = F_u(\sqrt{a})$,
equation (4.9) yields

$$\mathbf{E}\, u(t)^2 = \int_0^\infty F_{u^2}(a) \, da = \int_0^\infty F_u(\sqrt{a}) \, da = \int_0^\infty 2a F_u(a) \, da,$$

so that we can express the RMS norm as:

$$\|u\|_{\rm rms}^2 = \int_0^\infty 2a F_u(a) \, da. \qquad (4.10)$$

Thus, the average power in the signal is the integral of its amplitude distribution
function times $2a$. Just as we interpret formula (4.6) as expressing the average
power in the signal as the integral of the contributions at all frequencies, we may
interpret (4.10) as expressing the average power in the signal as the integral of the
contributions from all possible signal amplitudes.
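
Formulas (4.8), (4.9), and (4.10) can be checked numerically from an empirical amplitude distribution. The sketch below is an illustration under stated assumptions: a sampled sinusoid of amplitude 2 plays the role of $u$, the fraction of samples with $|u| > a$ stands in for the limit (4.7), and the integrals are evaluated on a finite grid of thresholds.

```python
import numpy as np

def trapezoid(y, x):
    """Composite trapezoidal rule for samples y on the grid x."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

t = np.linspace(0.0, 2000.0 * np.pi, 1_000_001)
u = 2.0 * np.sin(t)                          # assumed test signal
abs_sorted = np.sort(np.abs(u))

a = np.linspace(0.0, 2.0, 2001)              # thresholds covering [0, peak]
# Fraction of samples with |u| > a: a sampled stand-in for F_u(a) in (4.7).
F = 1.0 - np.searchsorted(abs_sorted, a, side="right") / abs_sorted.size

aa_from_F = trapezoid(F, a)                  # (4.9)
ms_from_F = trapezoid(2.0 * a * F, a)        # (4.10), the mean-square value
peak_from_F = float(a[F > 0].max())          # (4.8), up to the grid resolution

aa_direct = float(np.mean(np.abs(u)))
ms_direct = float(np.mean(u**2))
print(aa_from_F, aa_direct, ms_from_F, ms_direct, peak_from_F)
```

The same amplitude distribution yields all three norms; only the weighting applied to it differs.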

Comparison of the three formulas (4.8), (4.9), and (4.10) shows that the three
norms simply put different emphasis on large and small signal values: the
steady-state peak norm puts all of its emphasis on large values; the RMS norm puts
linearly weighted emphasis on signal amplitudes; and the average-absolute norm
puts uniform weight on all signal amplitudes.

Figure 4.11 Examples of periodic signals are shown in (a) and (c). Their
respective amplitude distribution functions are shown in (b) and (d). The
signal in (a) spends most of its time near its peaks; the amplitude distribution
falls rapidly near $a = \|u_1\|_\infty$. The signal in (c) spends most of its time
near 0; the amplitude distribution falls rapidly near $a = 0$.



4.2.6 $L_2$: Square Root Total Energy

The previous sections dealt with the sizes of signals that persist. In this section
and the next we examine some norms appropriate for transient signals, which decay
to zero as time progresses; such signals have zero as their steady-state peak, RMS,
and average-absolute norms.

The square root total energy, or $L_2$ norm, of a signal is defined by

$$\|u\|_2 \triangleq \left( \int_0^\infty u(t)^2 \, dt \right)^{1/2}.$$

This norm is the appropriate analog of the RMS norm for decaying signals, i.e., 
signals with finite total energy as opposed to finite steady-state power. 

A useful conceptual model for the $L_2$ norm is shown in figure 4.12. Here we
think of $u$ as a driving voltage to a resistive load immersed in a thermal mass.
This thermal mass is isolated (adiabatic), unlike the mass of figure 4.5, which is
connected to the ambient temperature (a very large thermal mass) through a finite
thermal resistance. The eventual temperature rise of the isolated thermal mass is
proportional to $\|u\|_2^2$.




Figure 4.12 The long term temperature rise of the mass is proportional
to the total energy in $u$, i.e. $T - T_{\rm init} \propto \|u\|_2^2$, where $T_{\rm init}$ is the initial
temperature of the mass, and $T$ is the final temperature of the mass.



As a practical example, suppose that $u$ represents the current through a voice
coil drive during a step input in commanded position, and the thermal time constant
of the voice coil is longer than the time over which $u$ is large. Then $\|u\|_2^2$ is a measure
of the temperature rise in the voice coil during a step command input.

By Parseval's theorem, the $L_2$ norm can be computed as an $L_2$ norm in the
frequency domain:

$$\|u\|_2 = \left( \frac{1}{2\pi} \int_{-\infty}^{\infty} |U(j\omega)|^2 \, d\omega \right)^{1/2}. \qquad (4.11)$$
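
Parseval's relation (4.11) has a direct discrete analog that can be checked with the FFT. In the sketch below the decaying signal, the sample period `dt`, and the record length are assumptions of the example; the Fourier transform $U(j\omega_k)$ is approximated by a Riemann sum, i.e., `dt` times the FFT of the samples.

```python
import numpy as np

dt = 1e-3
t = np.arange(0.0, 40.0, dt)
u = np.exp(-t) * np.sin(5.0 * t)            # assumed transient signal

# Time domain: Riemann-sum approximation of the total energy.
energy_time = float(np.sum(u**2) * dt)

# Frequency domain: U(j w_k) ~ dt * FFT(u)[k] on a grid with spacing dw.
U = dt * np.fft.fft(u)
dw = 2.0 * np.pi / (u.size * dt)
energy_freq = float(np.sum(np.abs(U)**2) * dw / (2.0 * np.pi))

l2_time, l2_freq = np.sqrt(energy_time), np.sqrt(energy_freq)
print(l2_time, l2_freq)
```

For this particular signal the exact total energy is $1/4 - 1/104$, which both computations reproduce closely.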



4.2.7 $L_1$: Total Fuel or Resource Consumption

The $L_1$ norm of a signal is defined as

$$\|u\|_1 \triangleq \int_0^\infty |u(t)| \, dt.$$

Just as the $L_2$ norm measures the total energy in a signal, while the RMS norm
measures its average power, the $L_1$ norm of a signal can be thought of as measuring
a total resource consumption, while the average-absolute norm measures a
steady-state average resource consumption. For example, if $u$ represents the compressed
gas flow through a nozzle during a particular spacecraft maneuver, then $\|u\|_1$ is
proportional to the total gas consumed during the maneuver.

4.2.8 Frequency Domain Weights 

The norms described above can be combined with an initial linear transformation 
that serves to emphasize or de-emphasize certain aspects of a signal. Typically, 
this initial transformation consists of passing the signal through an LTI filter with 
transfer function W, which we refer to as a frequency domain weight, as shown in 
figure 4.13. 



Figure 4.13 A frequency domain weighted norm is computed by passing
the signal $u$ through an LTI filter $W$, and then determining the (unweighted)
norm $\|Wu\|$ of this filtered signal.

The idea is that the weighting filter makes the norm more "sensitive" (i.e., assigns
larger values) to signals that have a large power spectral density at those frequencies
where $|W(j\omega)|$ is large. This idea can be made precise for the $W$-weighted RMS
norm, which we will denote $\|\cdot\|_{W,\rm rms}$. The power spectral density of the filtered
signal $Wu$ is

$$S_{Wu}(\omega) = S_u(\omega) |W(j\omega)|^2,$$

so the RMS norm of $Wu$ is

$$\|u\|_{W,\rm rms} = \|Wu\|_{\rm rms} = \left( \frac{1}{2\pi} \int_{-\infty}^{\infty} S_u(\omega) |W(j\omega)|^2 \, d\omega \right)^{1/2}. \qquad (4.12)$$

Thus, the weight emphasizes the power spectral density of $u$ where $|W(j\omega)|$ is large,
meaning it contributes more to the integral, and de-emphasizes the power spectral
density of $u$ where $|W(j\omega)|$ is small. We note from (4.12) that the weighted norm
depends only on the magnitude of the weighting transfer function $W$, and not its
phase. This is not true of all weighted norms.

We can view the effect of the weight $W$ as changing the relative importance of
different frequencies in the total power integral (4.12). We can also interpret the
$W$-weighted RMS norm in terms of the average power conceptual model shown in
figure 4.5, by changing the resistive load $R$ to a frequency-dependent load, i.e., a
more general passive admittance $G(s)$. The load admittance $G$ is related to the
weight $W$ by the spectral factorization

$$\frac{G(s) + G(-s)}{2} = W(s) W(-s), \qquad (4.13)$$

so that $\Re G(j\omega) = |W(j\omega)|^2$. Since the average power dissipated in $G$ at a frequency
$\omega$ is proportional to the real part (resistive component) of $G(j\omega)$, we see that the
total average power dissipated in $G$ is given by the square of (4.12), or the square
of the $W$-weighted RMS norm of $u$.

For example, suppose that $W(s) = (1 + \sqrt{2} s)/(1 + s)$, which gives up to 3dB
emphasis at frequencies above $\sqrt{2}$. The load admittance for this weight is
$G(s) = (1 + 2s)/(1 + s)$, which we realize as the parallel connection of a 1Ω resistor and the
series connection of a 1Ω resistor and a 1F capacitor, as shown in figure 4.14.
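
The relation between the weight and the load admittance in this example can be verified numerically. The sketch below evaluates $W$ and $G$ on an assumed logarithmic frequency grid and checks that the real part of $G(j\omega)$ equals $|W(j\omega)|^2$, and that the described circuit has admittance $G$.

```python
import numpy as np

w = np.logspace(-2, 2, 401)                  # assumed frequency grid, rad/s
s = 1j * w
W = (1.0 + np.sqrt(2.0) * s) / (1.0 + s)     # weight from the example
G = (1.0 + 2.0 * s) / (1.0 + s)              # load admittance from the example

# Admittance of a 1-ohm resistor in parallel with a series 1-ohm / 1-farad branch.
G_circuit = 1.0 + 1.0 / (1.0 + 1.0 / s)

gain_db = 20.0 * np.log10(np.abs(W))         # emphasis provided by the weight, in dB
print(gain_db[0], gain_db[-1])
```

The gain rises from about 0 dB at low frequencies toward $10 \log_{10} 2 \approx 3$ dB at high frequencies.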



Figure 4.14 The average power dissipated in the termination admittance
$G(s) = (1 + 2s)/(1 + s)$ is $\|u\|_{W,\rm rms}^2$, the square of the $W$-weighted RMS
norm of the driving voltage $u$, where $W(s) = (1 + \sqrt{2} s)/(1 + s)$.



Frequency domain weights are used less often with the other norms, mostly
because their effect is harder to understand than the simple formula (4.12). One
common exception is the maximum slew rate, which is the peak norm used with the
weight $W(s) = s$, a differentiator:

$$\|u\|_{\rm slew\_rate} \triangleq \|\dot{u}\|_\infty. \qquad (4.14)$$


Maximum slew-rate specifications occur frequently, especially on actuator signals. 
For example, an actuator signal may represent the position of a large valve, which 
is opened and closed by a motor that has a maximum speed. A graphical interpretation
of the slew-rate constraint $\|u\|_{\rm slew\_rate} \le 1$ is shown in figure 4.15.
The peak norm is sometimes used with higher order differentiators:

$$\|u\|_{\rm acc} \triangleq \left\| \frac{d^2 u}{dt^2} \right\|_\infty, \qquad \|u\|_{\rm jerk} \triangleq \left\| \frac{d^3 u}{dt^3} \right\|_\infty.$$

Weights that are successively higher order differentiators yield the amusing snap,
crackle, and pop norms.
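
On sampled data these differentiator-weighted peak norms can be estimated with finite differences. The sketch below is one simple way to do so, assuming a uniformly sampled signal; the helper name `derivative_peak`, the test signal $\sin 2t$, and the sample period are choices made for the example. For this signal the exact peaks of the first, second, and third derivatives are 2, 4, and 8.

```python
import numpy as np

def derivative_peak(u, dt, order):
    """Peak of the order-th forward-difference derivative of a uniformly sampled signal."""
    return float(np.max(np.abs(np.diff(u, n=order))) / dt**order)

dt = 1e-3
t = np.arange(0.0, 10.0, dt)
u = np.sin(2.0 * t)

slew = derivative_peak(u, dt, 1)   # peak of the first derivative
acc = derivative_peak(u, dt, 2)    # peak of the second derivative
jerk = derivative_peak(u, dt, 3)   # peak of the third derivative
print(slew, acc, jerk)
```

Finite differences amplify measurement noise, so with real data some smoothing before differencing would usually be needed.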

Figure 4.15 An interpretation of the maximum slew-rate specification
$\|u\|_{\rm slew\_rate} \le 1$: at every time $t$ the graph of $u$ must evolve within a cone, whose
sides have slopes of ±1. Examples of these cones are shown for $t = 0, 3, 6$;
$u$ is slew-rate limited at $t = 0$ and $t = 6$, since $\dot{u}(0) = 1$ and $\dot{u}(6) = -1$.



4.2.9 Time Domain Weights 

If the initial linear transformation consists of multiplying the signal by some given
function of time $w(t)$, we refer to $w$ as a time domain weight. One example is the
ITAE (integral of time multiplied by absolute error) norm from classical control,
defined as

$$\|u\|_{\rm itae} \triangleq \int_0^\infty t \, |u(t)| \, dt.$$

This is simply the $L_1$ norm of $u$ with the time domain weight $w(t) = t$, which serves
to emphasize the signal $u$ at large times, and de-emphasize the signal at small times.
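
As a worked example, for $u(t) = e^{-2t}$ the $L_1$ norm is $1/2$ and the ITAE norm is $1/4$. The sketch below reproduces both values by numerical integration; the signal, the truncation of the integral at $t = 40$, and the grid are assumptions of the example.

```python
import numpy as np

def trapezoid(y, x):
    """Composite trapezoidal rule for samples y on the grid x."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

t = np.linspace(0.0, 40.0, 400_001)
u = np.exp(-2.0 * t)                     # assumed decaying signal

l1 = trapezoid(np.abs(u), t)             # L1 norm, exact value 1/2
itae = trapezoid(t * np.abs(u), t)       # ITAE norm, exact value 1/4
print(l1, itae)
```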

One commonly used family of time domain weights is the family of exponential
weightings, which have the form $w(t) = \exp(at)$. If $a$ is positive, such a weighting
exponentially emphasizes the signal at large times. This may be appropriate for
measuring the size of a decaying signal. Alternatively, we can think of a specification
such as $\|\tilde{u}\|_\infty \le M$ (where $\tilde{u}(t) = \exp(at) u(t)$, and $a > 0$) as enforcing a rate of
decay in the signal $u$ at least as fast as $\exp(-at)$. An example of a signal $u$ and the
exponentially scaled signal $u(t) e^{-t}$ is shown in figure 4.16.

Figure 4.16 A signal $u$ is shown in (a). The exponentially weighted $L_2$
norm of $u$ is the $L_2$ norm of the signal $\tilde{u}(t) = u(t) e^{-t}$, which is shown in
(b).



If a is negative, then the signal is exponentially de-emphasized at large times. 
This might be useful to measure the size of a diverging or growing signal, where the 
value of an unweighted norm is infinite. 

There is a simple frequency domain interpretation of the exponentially weighted
$L_2$ norms. If the $a$-exponentially weighted $L_2$ norm of a signal $u$ is finite then its
Laplace transform $U(s)$ is analytic in the region $\{ s \mid \Re s > -a \}$, and in fact

$$\|\tilde{u}\|_2^2 = \int_0^\infty (\exp(at) u(t))^2 \, dt = \frac{1}{2\pi} \int_{-\infty}^{\infty} |U(-a + j\omega)|^2 \, d\omega \qquad (4.15)$$

(recall $\tilde{u}(t) = \exp(at) u(t)$). The only difference between (4.15) and (4.11) is that the
integral in the $a$-exponentially weighted norm is shifted by the weight exponent $a$.
The frequency domain calculations of the $L_2$ norm and the $a$-exponentially weighted
$L_2$ norm for the signal in figure 4.16(a) are shown in figure 4.17.
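
Relation (4.15) can be checked for a signal with a known Laplace transform. The sketch below assumes $u(t) = e^{-2t}$, so that $U(s) = 1/(s+2)$, and the weight exponent $a = 1$; both sides of (4.15) then equal $1/2$. The integration limits and grids are assumptions of the example.

```python
import numpy as np

def trapezoid(y, x):
    """Composite trapezoidal rule for samples y on the grid x."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

a = 1.0                                          # weight exponent
t = np.linspace(0.0, 40.0, 400_001)
u = np.exp(-2.0 * t)                             # assumed signal, U(s) = 1/(s + 2)

# Time-domain side of (4.15).
time_side = trapezoid((np.exp(a * t) * u)**2, t)

# Frequency-domain side: evaluate U on the shifted line s = -a + j*w.
w = np.linspace(-2000.0, 2000.0, 400_001)
U_shifted = 1.0 / ((-a + 1j * w) + 2.0)
freq_side = trapezoid(np.abs(U_shifted)**2, w) / (2.0 * np.pi)
print(time_side, freq_side)
```

The small difference between the two sides comes from truncating the frequency integral at $|\omega| = 2000$.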

Figure 4.17 An exponentially weighted $L_2$ norm can be calculated in the
frequency domain. Consider $\|u\|_{2,-1} = \|\tilde{u}\|_2$, where $\tilde{u}(t) = e^{-t} u(t)$ is shown
in figure 4.16(b). $|U(\sigma + j\omega)|$, the magnitude of the Laplace transform of $u$,
is shown in (a) for $\sigma \ge 0$. As shown in (b), $\|u\|_2^2$ is proportional to the area
under $|U(j\omega)|^2$ (the curve $\sigma = 0$), and $\|u\|_{2,-1}^2$ is proportional to the area
under $|U(1 + j\omega)|^2$ (the curve $\sigma = 1$).

4.3 Common Norms of Vector Signals

The norms for scalar signals described in section 4.2 have natural extensions to
vector signals. We now suppose that $u : \mathbf{R}_+ \to \mathbf{R}^n$, i.e., $u(t) \in \mathbf{R}^n$ for $t \ge 0$.

4.3.1 Peak 

The peak, or $L_\infty$ norm, of a vector signal is defined to be the maximum peak of
any of its components:

$$\|u\|_\infty \triangleq \max_{1 \le i \le n} \|u_i\|_\infty = \sup_{t \ge 0} \max_{1 \le i \le n} |u_i(t)|$$

(note the different uses of the notation $\|\cdot\|_\infty$). Thus, "$\|u\|_\infty$ is small" means
that every component of $u$ is always small. A two-input peak detector is shown in
figure 4.18.



Figure 4.18 With ideal diodes and an ideal capacitor, the voltage on the
capacitor, $V_c$, tends to $\|u\|_\infty$ for $t$ large, where $u = [u_1 \ u_2]^T$ is a vector of
two signals.



4.3.2 RMS 

We define the RMS norm of a vector signal as

$$\|u\|_{\rm rms} \triangleq \left( \lim_{T \to \infty} \frac{1}{T} \int_0^T u(t)^T u(t) \, dt \right)^{1/2}, \qquad (4.16)$$

provided this limit exists (see the Notes and References). For an ergodic wide-sense
stationary stochastic signal this can be expressed

$$\|u\|_{\rm rms}^2 = \mathbf{E}\, \|u(t)\|_2^2 = \mathrm{Tr}\, R_u(0) = \mathrm{Tr}\, \frac{1}{2\pi} \int_{-\infty}^{\infty} S_u(\omega) \, d\omega.$$

(For vector signals, the autocorrelation is defined by

$$R_u(\tau) \triangleq \mathbf{E}\, u(t) u(t + \tau)^T$$

(cf. (4.4)).) For such signals we have

$$\|u\|_{\rm rms} = \left( \sum_{i=1}^n \|u_i\|_{\rm rms}^2 \right)^{1/2}, \qquad (4.17)$$

i.e., $\|u\|_{\rm rms}$ is the square root of the sum of the mean-square values of the
components of $u$.
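
The vector RMS norm and its relation (4.17) to the component RMS values can be checked on sampled data. In the sketch below the two components are arbitrary sinusoids chosen for illustration, and sample averages replace the limit in (4.16); the vector peak norm is computed as well for comparison.

```python
import numpy as np

t = np.linspace(0.0, 2000.0 * np.pi, 1_000_001)
u = np.vstack([3.0 * np.sin(t),              # component u_1 (assumed)
               4.0 * np.cos(2.0 * t)])       # component u_2 (assumed)

rms_vector = float(np.sqrt(np.mean(np.sum(u**2, axis=0))))       # as in (4.16)
rms_components = np.sqrt(np.mean(u**2, axis=1))                   # scalar RMS of each u_i
rms_from_components = float(np.sqrt(np.sum(rms_components**2)))   # as in (4.17)
peak_vector = float(np.max(np.abs(u)))                            # max over i and t
print(rms_vector, rms_from_components, peak_vector)
```

For these components the exact values are $\sqrt{9/2 + 16/2} = \sqrt{12.5}$ for the RMS norm and 4 for the peak.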

A conceptual model for the RMS value of a vector voltage signal is shown in 
figure 4.19. 




Figure 4.19 If $u_1$ and $u_2$ vary much faster than the thermal time constant
of the mass, then the long term temperature rise of the mass is proportional
to the average power in the vector signal $u$, i.e. $T - T_{\rm amb} \propto \|u\|_{\rm rms}^2$, where
$T_{\rm amb}$ is the ambient temperature, and $T$ is the temperature of the mass.



4.3.3 Average-Absolute 

The average-absolute norm of a vector signal is defined by

$$\|u\|_{\rm aa} \triangleq \limsup_{T \to \infty} \frac{1}{T} \int_0^T \sum_{i=1}^n |u_i(t)| \, dt.$$

This measures the average total resource consumption of all the components of $u$.
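
4.3.4 $L_2$ and $L_1$ Norms

The $L_2$ and $L_1$ norms of a vector signal are defined by

$$\|u\|_2 \triangleq \left( \int_0^\infty \sum_{i=1}^n u_i(t)^2 \, dt \right)^{1/2} = \left( \sum_{i=1}^n \|u_i\|_2^2 \right)^{1/2},$$

$$\|u\|_1 \triangleq \int_0^\infty \sum_{i=1}^n |u_i(t)| \, dt = \sum_{i=1}^n \|u_i\|_1.$$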

4.3.5 Scaling and Weighting 

A very important concept associated with vector signal norms is scaling, which can 
be thought of as assigning relative weights to the different components of the vector 
signal. For example, suppose that $u_1$ represents the output voltage and $u_2$ the
output current of a power amplifier that voltage saturates (clips) at ±100V and
current limits at ±2A. One appropriate measure of the peak of this signal is

$$\|u\|_{D,\infty} = \sup_{t \ge 0} \max\left\{ \frac{|u_1(t)|}{100}, \frac{|u_2(t)|}{2} \right\} = \|Du\|_\infty,$$

where

$$D = \begin{bmatrix} 1/100 & 0 \\ 0 & 1/2 \end{bmatrix}.$$

$D$ is referred to as a scaling matrix, and $\|u\|_{D,\infty}$ the $D$-scaled peak of $u$. We can
interpret $\|u\|_{D,\infty}$ as the size of $u$, relative to amplifier voltage or current overload:
$\|u\|_{D,\infty} = 0.5$ indicates 6dB of headroom before the amplifier overloads; $\|u\|_{D,\infty} = 1.0$
indicates that the amplifier will just saturate or current limit.
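
The $D$-scaled peak of the amplifier example can be sketched in a few lines. The sample values and the helper names `scaled_peak` and `headroom_db` are hypothetical; only the limits of 100V and 2A come from the text.

```python
import math

def scaled_peak(components, limits):
    """D-scaled peak: largest |u_i(t)| / limit_i over all components and samples."""
    return max(abs(x) / lim for comp, lim in zip(components, limits) for x in comp)

def headroom_db(peak):
    """Headroom before overload in dB (positive when the scaled peak is below 1)."""
    return -20 * math.log10(peak)

# Hypothetical samples: the voltage peaks at 50 V, the current at 0.6 A.
voltage = [0.0, 35.0, -50.0, 20.0]
current = [0.1, -0.6, 0.3, 0.0]

peak = scaled_peak([voltage, current], [100.0, 2.0])
print(peak, headroom_db(peak))
```

Here the voltage component dominates (50/100 against 0.6/2), giving a scaled peak of 0.5 and about 6dB of headroom, in line with the interpretation above.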

When the different components of a vector signal u represent different physical 
quantities, as in the example above, the use of an appropriate scaling matrix is 
crucial. It is useful to think of the scaling matrix as including the translation 
factors among the various physical units of the components of u: for our example 
above, 



$$D = \begin{bmatrix} 1/(100\,\mathrm{V}) & 0 \\ 0 & 1/(2\,\mathrm{A}) \end{bmatrix},$$

which properly renders $\|u\|_{D,\infty}$ unitless.

A simple rule-of-thumb is to use scale factors that are inversely proportional to 
what we might consider typical, nominal, or maximum acceptable values for that 
component of the signal. The scaling in our example above is based on this principle, 
using the maximum acceptable (overload) values. We have already encountered this 
idea in sections 3.6.1 and 3.6.3.

For average-absolute or $L_1$ norms, the scale factors can be interpreted as relative
costs or values of the resources or commodities represented by the different
components of the signal. For RMS or $L_2$ norms, the scale factors might represent
different termination resistances in the conceptual model in figure 4.19.

Scaling is a very special form of weighting, since it consists of applying a linear 
transformation to the signal before computing its norm; the linear transformation 
is just multiplication by a diagonal matrix. Of course, it is also possible to multiply 
the signal vector by a nondiagonal matrix before computing the norm, as in 

$$\|u\|_{A,2} = \|Au\|_2,$$

where $A$ is some matrix. One familiar example of this is the weighted $L_2$ norm,

$$\|u\|_{A,2} = \left( \int_0^\infty u(t)^T R\,u(t)\,dt \right)^{1/2},$$

where $R = A^T A$. $A$ is called a (constant) weight matrix. Constant weight matrices
can be used to emphasize some directions in $\mathbf{R}^n$ while de-emphasizing others,
whereas with (diagonal) scaling matrices we are restricted to directions aligned with
the axes. An example of this distinction is shown in figure 4.20 for the constraint

$$\|u\|_{A,\infty} = \|Au\|_\infty \le 1.$$

When $A$ is diagonal the signal $u(t)$ is constrained to lie in a rectangle at each time
$t$. For a general matrix $A$, the signal $u(t)$ is constrained to lie in a trapezoid.

More generally, we can preprocess the signal by a weighting transfer matrix $W$
that has $n$ columns, as in

$$\|u\|_{W,\mathrm{rms}} = \|Wu\|_{\mathrm{rms}}.$$

Often, $W$ is square, i.e., it has $n$ rows as well. $W$ might emphasize different "directions"
at different frequencies.

4.4 Comparing Norms 

We have seen many norms for signals. A natural question is: how different can they 
be? Intuition suggests that since these different norms each measure the "size" of 
a signal, they should generally agree about whether a signal is "small" or "large". 
This intuition is generally false, however. 
For scalar signals we have

$$\|u\|_\infty \ge \|u\|_{\mathrm{rms}} \ge \|u\|_{\mathrm{aa}}; \tag{4.18}$$

for vector signals with $n$ components we have the generalization

$$\|u\|_\infty \ge \frac{1}{\sqrt{n}}\,\|u\|_{\mathrm{rms}} \ge \frac{1}{n}\,\|u\|_{\mathrm{aa}}. \tag{4.19}$$

Figure 4.20 The constraint $\|u\|_{A,\infty} \le 1$ requires that the signal $u(t)$ lie
inside a trapezoid at all times $t$. Three weighting matrices $A$ are shown,
together with the corresponding limits on the signal $u$.

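The first inequality in (4.18) can be shown by replacing $u^2$ by the upper bound $\|u\|_\infty^2$:

$$\|u\|_{\mathrm{rms}}^2 = \lim_{T\to\infty} \frac{1}{T}\int_0^T u(t)^2\,dt \le \lim_{T\to\infty} \frac{1}{T}\int_0^T \|u\|_\infty^2\,dt = \|u\|_\infty^2.$$

The second inequality in (4.18) follows from the Cauchy-Schwarz inequality:

$$\begin{aligned}
\|u\|_{\mathrm{aa}} &= \lim_{T\to\infty} \frac{1}{T}\int_0^T 1 \cdot |u(t)|\,dt \\
&\le \lim_{T\to\infty} \left( \frac{1}{T}\int_0^T 1^2\,dt \right)^{1/2} \left( \frac{1}{T}\int_0^T |u(t)|^2\,dt \right)^{1/2} \\
&= \|u\|_{\mathrm{rms}}.
\end{aligned}$$

It can also be shown that

$$\|u\|_\infty \ge \|u\|_{\mathrm{ss}\infty} \ge \|u\|_{\mathrm{rms}} \ge \|u\|_{\mathrm{aa}}.$$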



Another norm inequality, that gives a lower bound for $\|u\|_{\mathrm{aa}}$, is

$$\|u\|_{\mathrm{rms}}^2 \le \|u\|_{\mathrm{aa}}\,\|u\|_\infty. \tag{4.20}$$

This inequality can be understood by considering the power amplifier shown in
figure 4.8, with a 1Ω load resistance. If we have $V_{cc} = \|u\|_\infty$, then the amplifier
does not saturate (clip), and the voltage on the load is the input voltage, $u(t)$. The
average power delivered to the load is therefore $\|u\|_{\mathrm{rms}}^2$. The average power supply
current is $\|u\|_{\mathrm{aa}}$, so the average power delivered by the power supply is $\|u\|_{\mathrm{aa}} V_{cc}$,
which is $\|u\|_{\mathrm{aa}}\|u\|_\infty$. Of course, the average power delivered to the load does not
exceed the average power drain on the power supply (the difference is dissipated in
the transistors), so we have $\|u\|_{\mathrm{rms}}^2 \le \|u\|_{\mathrm{aa}}\|u\|_\infty$, which is (4.20).

If the signal u is close to a switching signal, which means that it spends most of 
its time near its peak value, then the values of the peak, steady-state peak, RMS 
and average-absolute norms will be close. The crest factor of a signal gives an 
indication of how much time a signal spends near its peak value. The crest factor 
is defined as the ratio of the steady-state peak value to the RMS value of a signal:

$$\mathrm{CF}(u) = \frac{\|u\|_{\mathrm{ss}\infty}}{\|u\|_{\mathrm{rms}}}.$$

Since $\|u\|_{\mathrm{ss}\infty} \ge \|u\|_{\mathrm{rms}}$, the crest factor of a signal is at least 1. The crest factor is
a measure of how rapidly the amplitude distribution of the signal increases below
$a = \|u\|_{\mathrm{ss}\infty}$; see figure 4.11. The two signals in figure 4.11 have crest factors of 1.15
and 5.66, respectively.
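
The inequalities (4.18) and (4.20) and the crest factor can be checked numerically on a simple case. The sketch below uses one period of a unit sinusoid as an assumed test signal (so the steady-state peak equals the peak, and time averages reduce to averages over a period); the helper names are illustrative only.

```python
import math

def peak(u):
    """Peak norm: largest absolute sample."""
    return max(abs(x) for x in u)

def rms(u):
    """RMS norm: square root of the mean-square value."""
    return math.sqrt(sum(x * x for x in u) / len(u))

def average_absolute(u):
    """Average-absolute norm: mean of the absolute samples."""
    return sum(abs(x) for x in u) / len(u)

# Hypothetical periodic signal: one period of a unit-amplitude sinusoid.
N = 10000
u = [math.sin(2 * math.pi * (k + 0.5) / N) for k in range(N)]

pk, r, aa = peak(u), rms(u), average_absolute(u)
crest_factor = pk / r
print(pk, r, aa, crest_factor)
```

For a sinusoid the three norms are approximately 1, $1/\sqrt{2}$, and $2/\pi$, so the ordering in (4.18) holds, $\|u\|_{\mathrm{rms}}^2 = 0.5$ is below $\|u\|_{\mathrm{aa}}\|u\|_\infty \approx 0.64$ as (4.20) requires, and the crest factor is about $\sqrt{2}$.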

The crest factor can be combined with the upper bound in (4.20) to give a bound
on how much the RMS norm exceeds the average-absolute norm:

$$\frac{\|u\|_{\mathrm{rms}}}{\|u\|_{\mathrm{aa}}} \le \mathrm{CF}(u).$$

Notes and References

For general references on vector spaces, norms of signals, or norms in general, see the 
Notes and References for the next chapter. The text [WH85] by Wong and Hajek covers 
stochastic signals. 

The bold L in the symbols $L_1$, $L_2$, and $L_\infty$ stands for the mathematician H. Lebesgue.

Our Definition of Norm 

Our definition 4.1 differs from the standard definition of a norm in two ways: first, we
allow norms to take on the value $+\infty$; and second, we do not require $\|v\| > 0$ for nonzero
$v$ (which is called the definiteness property). The standard term for what we call a norm
is seminorm. We will not need the finiteness or definiteness properties of norms; in fact,
the only property of norms that we use in this book is convexity (see chapter 6). Our less
formal usage of the term norm allows us to give a less technical discussion of norms of
signals.

The mathematically sophisticated reader can form a standard norm from each of our
seminorms by first restricting it to the subspace of all signals for which $\|u\|$ is finite, and
then forming the quotient space, modulo the subspace of all signals for which $\|u\|$ is zero.
This process is discussed in any mathematics text covering norms, e.g., Kolmogorov and
Fomin [KF75] and Aubin [Aub79]. As an example, $\|\cdot\|_{\mathrm{ss}\infty}$ is a standard norm on the
vector space of equivalence classes of eventually bounded signals, where the equivalence
classes consist of signals that differ by a transient, i.e., signals that converge to each other
as $t \to \infty$.

Some Mathematical Notes 

There are signals for which the RMS value (4.1) or average-absolute value (4.2) are not defined
because the limits in these expressions fail to exist: for example, $u(t) = \cos\log(1+t)$.
These norms will always be defined if we substitute limsup for lim in the definitions (4.1)
and (4.2). With this generalized definition of the RMS and average-absolute norm, many
but not all of the properties discussed in this chapter still hold. For example, with lim sup
substituted for lim in the definition of the RMS norm of a vector signal (given in (4.16)),
equation (4.17) need not hold.
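
The failure of the limit for $u(t) = \cos\log(1+t)$ can be made concrete. The sketch below evaluates the running mean-square value $(1/T)\int_0^T u(t)^2\,dt$ in closed form, using the substitution $x = \log(1+t)$ and the antiderivative $F(x) = \tfrac{1}{2}e^x\left(1 + (\cos 2x + 2\sin 2x)/5\right)$ of $e^x\cos^2 x$; this antiderivative and the sampling of horizons are additions for illustration, not part of the original notes.

```python
import math

def running_mean_square(T):
    """(1/T) * integral_0^T cos(log(1+t))^2 dt, evaluated with x = log(1+t)."""
    def F(x):
        # Antiderivative: d/dx F(x) = exp(x) * cos(x)^2.
        return 0.5 * math.exp(x) * (1 + (math.cos(2 * x) + 2 * math.sin(2 * x)) / 5)
    return (F(math.log(1 + T)) - F(0.0)) / T

# Sample the running average at horizons T = e^x - 1 for x between 10 and 20.
values = [running_mean_square(math.exp(10 + 0.05 * k) - 1) for k in range(201)]
print(min(values), max(values))
```

For large $T$ the running average behaves like $1/2 + (\cos(2\log(1+T)) + 2\sin(2\log(1+T)))/10$, so it keeps oscillating between roughly 0.28 and 0.72 rather than settling to a limit.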

We also note that the integral defining the power spectral density (equation (4.5)) need 
not exist. In this case the process has a spectral measure. 

Spectral Factorization of Weights 

Youla [You61] developed a spectral factorization analogous to (4.13) for transfer matrices,
which yields an interpretation of the $W$-weighted RMS norm as the square root of the
total average power dissipated in a passive $n$-port admittance $G(s)$. In [And67], Anderson
showed how this spectral factorization for transfer matrices can be computed using
state-space methods, by solving an algebraic Riccati equation. See also section 7.3 of
Francis [Fra87].



Chapter 5 

Norms of Systems 



A notion closely related to the size of a signal is the size of a transfer function or 
LTI system. In this chapter we explore some of the ways this notion of size of a 
system can be made precise. 



5.1 Paradigms for System Norms 

In this chapter we discuss methods of measuring the size of an LTI system with input 
w, output z, and transfer matrix H, shown in figure 5.1. Many commonly used 
norms for LTI systems can be interpreted as examples of several general methods 
for measuring the size of a system in terms of the norms of its input and output 
signals. In the next few subsections we describe these general methods. 



Figure 5.1 A linear system with input $w$ and output $z$.

We leave as an exercise for the reader the verification that the norms we will 
encounter in this chapter do satisfy the required properties (i.e., definition 4.1). 

5.1.1 Norm of a Particular Response 

The simplest general method for measuring the size of a system is to measure the
size of its response to a particular input signal $w_{\mathrm{part}}$, e.g., a unit impulse, a unit
step, or a stochastic signal with a particular power spectral density. If we use the
norm $\|\cdot\|_{\mathrm{output}}$ to measure the size of the response, as shown in figure 5.2, we define

$$\|H\|_{\mathrm{part}} = \|H w_{\mathrm{part}}\|_{\mathrm{output}}.$$

(It can be shown that this functional satisfies all of the properties required for a
norm, as our notation suggests.)

Figure 5.2 The size of a transfer matrix $H$ can be measured by applying
a particular signal $w_{\mathrm{part}}$, and measuring the size of the output with some
suitable signal norm $\|\cdot\|_{\mathrm{output}}$.



5.1.2 Average Response Norm 

A general method for measuring the size of a system, that directly takes into account 
the response of the system to many input signals (and not just one particular input 
signal), is to measure the average size of the response of H to a specific probability 
distribution of input signals. If $\|\cdot\|_{\mathrm{output}}$ measures the size of the response, we
define

$$\|H\|_{\mathrm{avg}} = \mathbf{E}_w\,\|Hw\|_{\mathrm{output}},$$

where $\mathbf{E}_w$ denotes expectation with respect to the distribution of input signals.

5.1.3 Worst Case Response Norm 

Another general method for measuring the size of a system, that takes into account 
the response of the system to many input signals, is to measure the worst case or 
largest norm of the response of $H$ to a specific collection of input signals. If $\|\cdot\|_{\mathrm{output}}$
measures the size of the response, we define

$$\|H\|_{\mathrm{wc}} = \sup_{w \in \mathcal{W}} \|Hw\|_{\mathrm{output}},$$

where $\mathcal{W}$ denotes the collection of input signals.

5.1.4 Gain of a System 

An important special case of a worst case norm is a gain, defined as the largest ratio 
of the norm of the output to the norm of the input. If $\|\cdot\|$ is used to measure the
size of both the input and output signals, we define

$$\|H\|_{\mathrm{gn}} = \sup_{\|w\| \ne 0} \frac{\|Hw\|}{\|w\|}. \tag{5.1}$$

The gain $\|H\|_{\mathrm{gn}}$ is therefore the maximum factor by which the system can scale the
size (measured by the norm $\|\cdot\|$) of a signal flowing through it. The gain can also
be expressed as a worst case response norm:

$$\|H\|_{\mathrm{gn}} = \sup_{\|w\| \le 1} \|Hw\|.$$

If the transfer matrix H is not square, we cannot really use the same norm to 
measure the input and output signals, since they have different numbers of com- 
ponents. In such cases we rely on our naming conventions to identify the "same" 
norm to be used for the input and the output. For example, the RMS gain of a 
2x3 transfer matrix is defined by (5.1), where the norm in the numerator is the 
RMS norm of a vector signal with 2 components, and the norm in the denominator 
is the RMS norm of a vector signal with 3 components. It is also possible to define 
a more general gain with different types of norms on the input and output, but we 
will not use this generalization. 

5.2 Norms of SISO LTI Systems 

In this section we describe various norms for single-input, single-output (SISO) 
systems. 

5.2.1 Peak-Step 

Our first example of a norm of a system is from the first paradigm: the size of 
its response to a particular input. The particular input signal is a unit step; we 
measure the size of the response by its peak norm. We define:

$$\|H\|_{\mathrm{pk\_step}} = \|s\|_\infty,$$

where $s(t)$ denotes the step response of $H$; we will refer to $\|H\|_{\mathrm{pk\_step}}$ as the peak-step
norm of $H$. This would be an appropriate measure if, say, $w$ represents a set-point
command in a control system (a signal that might be expected to change values
only occasionally), and $z$ represents some actuator signal, say, a motor voltage. In
this case, $\|H\|_{\mathrm{pk\_step}}$ (multiplied by the maximum possible changes in the set-point
command) would represent a good approximation of the maximum motor voltage
(due to set-point changes) that might be encountered in the operation of the system.
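
As a small illustration of the peak-step norm, the sketch below evaluates the step response of an assumed second-order system $H(s) = 1/(s^2 + 2\zeta s + 1)$ with $\zeta = 0.5$ on a fine time grid and takes the largest absolute value. The system, the grid, and the function names are hypothetical choices, not taken from the text.

```python
import math

def step_response(t, zeta=0.5):
    """Unit step response of H(s) = 1 / (s^2 + 2*zeta*s + 1), valid for 0 < zeta < 1."""
    wd = math.sqrt(1 - zeta ** 2)  # damped natural frequency
    return 1 - math.exp(-zeta * t) * (math.cos(wd * t) + zeta / wd * math.sin(wd * t))

def peak_step_norm(s, t_end=40.0, dt=1e-3):
    """Approximate sup_t |s(t)| on a uniform grid over [0, t_end]."""
    n = int(t_end / dt)
    return max(abs(s(k * dt)) for k in range(n + 1))

pk = peak_step_norm(step_response)
print(pk)
```

For this underdamped system the peak occurs at the first overshoot, where the step response equals $1 + e^{-\pi\zeta/\sqrt{1-\zeta^2}} \approx 1.16$; a set-point change of size $c$ would then produce an output peak of about $1.16c$.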

5.2.2 RMS Response to a Particular Noise Input 

A common measure of the size of a transfer function is the RMS value of its output 
when its input is some particular stationary stochastic process. Suppose that the 
particular input $w$ has power spectral density $S_w(\omega)$, and $H$ is stable, meaning that
all of its poles have negative real part. The power spectral density of the output $z$
of $H$ is then

$$S_z(\omega) = S_w(\omega)\,|H(j\omega)|^2,$$

and therefore

$$\|z\|_{\mathrm{rms}} = \left( \frac{1}{2\pi}\int_{-\infty}^{\infty} |H(j\omega)|^2 S_w(\omega)\,d\omega \right)^{1/2}. \tag{5.2}$$

Thus we assign to $H$ the norm

$$\|H\|_{\mathrm{rms},w} = \left( \frac{1}{2\pi}\int_{-\infty}^{\infty} |H(j\omega)|^2 S_w(\omega)\,d\omega \right)^{1/2}. \tag{5.3}$$

The right-hand side of (5.3) has the same form as (4.12), with $H$ substituted
for $W$. The interpretations are different, however: in (5.3), $w$ is some fixed signal,
and we are measuring the size of the LTI system $H$, whereas in (4.12), $W$ is a fixed
weighting transfer function, and we are measuring the size of the signal $w$.

5.2.3 $H_2$ Norm: RMS Response to White Noise

Consider the RMS response norm above. If $S_w(\omega) \approx 1$ at those frequencies where
$|H(j\omega)|$ is significant, then we have

$$\|z\|_{\mathrm{rms}} \approx \left( \frac{1}{2\pi}\int_{-\infty}^{\infty} |H(j\omega)|^2\,d\omega \right)^{1/2}.$$

It is convenient to think of such a signal as an approximation of a white noise signal,
a fictitious input signal with $S_w(\omega) = 1$ for all $\omega$ (and thus, infinite power, which
we conveniently overlook).

This important norm of a stable system is denoted

$$\|H\|_2 = \left( \frac{1}{2\pi}\int_{-\infty}^{\infty} |H(j\omega)|^2\,d\omega \right)^{1/2}$$

(we assign $\|H\|_2 = \infty$ for unstable $H$), and referred to as the $H_2$ norm of $H$.

Thus we have the important fact: the $H_2$ norm of a transfer function measures
the RMS response of its output when it is driven by a white noise excitation.
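
The frequency-domain integral defining the $H_2$ norm can be evaluated numerically. The sketch below is one possible approach, not a method from the text: it substitutes $\omega = \tan\theta$ to map the real line onto a finite interval and applies the midpoint rule. The example system $H(s) = 1/(s+1)$ and the function name `h2_norm` are assumptions.

```python
import math

def h2_norm(H, n=20000):
    """Approximate ||H||_2 = sqrt((1/(2*pi)) * integral of |H(jw)|^2 dw over all w).

    The substitution w = tan(theta) maps the real line to (-pi/2, pi/2); the
    midpoint rule is then applied in theta (dw = dtheta / cos(theta)^2).
    """
    dtheta = math.pi / n
    total = 0.0
    for k in range(n):
        theta = -math.pi / 2 + (k + 0.5) * dtheta
        w = math.tan(theta)
        total += abs(H(1j * w)) ** 2 / math.cos(theta) ** 2 * dtheta
    return math.sqrt(total / (2 * math.pi))

# Hypothetical first-order example: H(s) = 1/(s + 1).
print(h2_norm(lambda s: 1 / (s + 1)))
```

For $H(s) = 1/(s+1)$ the integral of $1/(1+\omega^2)$ over the real line is $\pi$, so the norm should come out close to $1/\sqrt{2}$. The approach assumes $H$ is stable and strictly proper, so that the integrand decays at high frequency.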

The $H_2$ norm can be given another interpretation. By the Parseval theorem,

$$\|H\|_2 = \left( \int_0^\infty h(t)^2\,dt \right)^{1/2} = \|h\|_2,$$

the $L_2$ norm of the impulse response $h$ of the LTI system. Thus we can interpret
the $H_2$ norm of a system as the $L_2$ norm of its response to the particular input
signal $\delta$, a unit impulse.

5.2.4 A Worst Case Response Norm

Let us give an example of measuring the size of a transfer function using the worst
case response paradigm. Suppose that not much is known about $w$ except that
$\|w\|_\infty \le M_{\mathrm{ampl}}$ and $\|\dot w\|_\infty \le M_{\mathrm{slew}}$, i.e., $w$ is bounded by $M_{\mathrm{ampl}}$ and slew-rate
limited by $M_{\mathrm{slew}}$. If the peak of the output $z$ is critical, a reasonable measure of
the size of $H$ is

$$\|H\|_{\mathrm{wc}} = \sup\left\{ \|Hw\|_\infty \;\middle|\; \|w\|_\infty \le M_{\mathrm{ampl}},\ \|\dot w\|_\infty \le M_{\mathrm{slew}} \right\}.$$

In other words, $\|H\|_{\mathrm{wc}}$ is the worst case (largest) peak of the output, over all inputs
bounded by $M_{\mathrm{ampl}}$ and slew-rate limited by $M_{\mathrm{slew}}$.

5.2.5 Peak Gain 

The peak gain of an LTI system is

$$\|H\|_{\mathrm{pk\_gn}} = \sup_{\|w\|_\infty \ne 0} \frac{\|Hw\|_\infty}{\|w\|_\infty}. \tag{5.4}$$

It can be shown that the peak gain of a transfer function is equal to the $L_1$ norm
of its impulse response:

$$\|H\|_{\mathrm{pk\_gn}} = \int_0^\infty |h(t)|\,dt = \|h\|_1. \tag{5.5}$$

The peak gain of a transfer function is finite if and only if the transfer function is 
stable. 

To establish (5.5) we consider the input signal

$$w(t) = \begin{cases} \operatorname{sgn}(h(T-t)) & \text{for } 0 \le t \le T \\ 0 & \text{otherwise,} \end{cases} \tag{5.6}$$

which has $\|w\|_\infty = 1$ (the sign function, $\operatorname{sgn}(\cdot)$, has the value 1 for positive arguments,
and $-1$ for negative arguments). The output at time $T$ is

$$\begin{aligned}
z(T) &= \int_0^T w(T-t)\,h(t)\,dt \\
&= \int_0^T \operatorname{sgn}(h(t))\,h(t)\,dt \\
&= \int_0^T |h(t)|\,dt,
\end{aligned}$$

which converges to $\|h\|_1$ as $T \to \infty$. So for large $T$ (and $H$ stable, so that
$\|H\|_{\mathrm{pk\_gn}} < \infty$), the signal (5.6) yields $\|z\|_\infty/\|w\|_\infty$ near $\|h\|_1$; it is also possible to
show that there is a signal $w$ such that $\|z\|_\infty/\|w\|_\infty = \|h\|_1$.

The peak gain of a system can also be expressed in terms of its step response $s$:

$$\|H\|_{\mathrm{pk\_gn}} = \mathrm{Tv}(s),$$

where $\mathrm{Tv}(f)$, the total variation of a function $f$, is defined by

$$\mathrm{Tv}(f) = \sup_{0 \le t_1 \le \cdots \le t_N} \sum_{i=1}^{N-1} |f(t_i) - f(t_{i+1})|.$$

Roughly speaking, $\mathrm{Tv}(f)$ is the sum of all consecutive peak-to-valley differences in
$f$; this is shown in figure 5.3.

Figure 5.3 The peak gain of a transfer function is equal to the total
variation of its step response $s$, i.e., the sum of all the consecutive peak-to-valley
differences (shown as arrows) of $s$.
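
The two expressions for the peak gain, the $L_1$ norm of the impulse response and the total variation of the step response, can be compared numerically. The sketch below uses an assumed second-order system $H(s) = 1/(s^2 + 2\zeta s + 1)$ with $\zeta = 0.5$, whose impulse and step responses are known in closed form; the system, grids, and function names are illustrative choices.

```python
import math

ZETA = 0.5                      # hypothetical damping ratio
WD = math.sqrt(1 - ZETA ** 2)   # damped natural frequency

def impulse_response(t):
    """h(t) for H(s) = 1 / (s^2 + 2*ZETA*s + 1)."""
    return math.exp(-ZETA * t) * math.sin(WD * t) / WD

def step_response(t):
    """s(t), the integral of h from 0 to t."""
    return 1 - math.exp(-ZETA * t) * (math.cos(WD * t) + ZETA / WD * math.sin(WD * t))

def l1_norm(h, t_end=60.0, dt=1e-3):
    """Midpoint-rule approximation of the integral of |h(t)| over [0, t_end]."""
    n = int(t_end / dt)
    return sum(abs(h((k + 0.5) * dt)) for k in range(n)) * dt

def total_variation(s, t_end=60.0, dt=1e-3):
    """Sum of |s(t_i) - s(t_{i+1})| over a fine uniform grid."""
    n = int(t_end / dt)
    samples = [s(k * dt) for k in range(n + 1)]
    return sum(abs(b - a) for a, b in zip(samples, samples[1:]))

peak_gain_l1 = l1_norm(impulse_response)
peak_gain_tv = total_variation(step_response)
print(peak_gain_l1, peak_gain_tv)
```

With $x = e^{-\pi\zeta/\sqrt{1-\zeta^2}}$, successive lobes of $h$ shrink by the factor $x$, so both quantities should be close to $(1+x)/(1-x) \approx 1.39$. This exceeds the peak of the step response (about 1.16), since a worst case input of the form (5.6) switches sign to reinforce every lobe.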

It turns out that the peak gain of a SISO transfer function is also the average-absolute
gain:

$$\|H\|_{\mathrm{pk\_gn}} = \|H\|_{\mathrm{aa\_gn}} = \sup_{\|w\|_{\mathrm{aa}} \ne 0} \frac{\|Hw\|_{\mathrm{aa}}}{\|w\|_{\mathrm{aa}}}. \tag{5.7}$$



5.2.6 $H_\infty$ Norm: RMS Gain

An important norm of a transfer function is its RMS gain:

$$\|H\|_{\mathrm{rms\_gn}} = \sup_{\|w\|_{\mathrm{rms}} \ne 0} \frac{\|Hw\|_{\mathrm{rms}}}{\|w\|_{\mathrm{rms}}}. \tag{5.8}$$

The RMS gain of a transfer function turns out to coincide with its $L_2$ gain,

$$\|H\|_{\mathrm{rms\_gn}} = \sup_{\|w\|_2 \ne 0} \frac{\|Hw\|_2}{\|w\|_2}, \tag{5.9}$$

and is equal to the maximum magnitude of the transfer function,

$$\|H\|_{\mathrm{rms\_gn}} = \sup_{\omega} |H(j\omega)|, \tag{5.10}$$

when $H$ is stable; for unstable $H$ we have $\|H\|_{\mathrm{rms\_gn}} = \infty$. For this reason, the RMS
gain is sometimes called the maximum magnitude norm or the Chebychev norm of
a transfer function. We note that the right-hand side of (5.10) can be interpreted as
a worst case response norm of $H$: it is the largest steady-state peak of the response
of $H$ to any unit amplitude sinusoid ($w(t) = \cos\omega t$).

Equations (5.8-5.10) show that four reasonable interpretations of "the transfer 
function H is small" coincide: 

• the RMS value of its output is always small compared to the RMS value of 
its input; 

• the total energy of its output is always small compared to the total energy of 
its input; 

• the transfer function $H(j\omega)$ has a small magnitude at all frequencies;

• the steady-state peak of the response to a unit amplitude sinusoid of any 
frequency is small. 

The RMS gain of a transfer function $H$ can be expressed as its maximum magnitude
in the right half of the complex plane:

$$\|H\|_{\mathrm{rms\_gn}} = \|H\|_\infty = \sup_{\Re s > 0} |H(s)|, \tag{5.11}$$

which is called the $H_\infty$ norm of $H$. (Note the very different meaning from $\|w\|_\infty$,
the $L_\infty$ norm of a signal.)

Let us establish (5.10) in the case where $w$ is stochastic and $H$ is stable; it is
not hard to establish it in general. Let $S_w(\omega)$ denote the power spectral density of
$w$. The power spectral density of the output $z$ is then $S_z(\omega) = |H(j\omega)|^2 S_w(\omega)$, and
therefore

$$\begin{aligned}
\|z\|_{\mathrm{rms}}^2 &= \frac{1}{2\pi}\int_{-\infty}^{\infty} S_z(\omega)\,d\omega \\
&= \frac{1}{2\pi}\int_{-\infty}^{\infty} |H(j\omega)|^2 S_w(\omega)\,d\omega \\
&\le \sup_{\omega} |H(j\omega)|^2\ \frac{1}{2\pi}\int_{-\infty}^{\infty} S_w(\omega)\,d\omega \\
&= \|H\|_\infty^2\,\|w\|_{\mathrm{rms}}^2.
\end{aligned}$$

Thus we have for all $w$ with nonzero RMS value

$$\frac{\|Hw\|_{\mathrm{rms}}}{\|w\|_{\mathrm{rms}}} \le \|H\|_\infty,$$

which shows that $\|H\|_{\mathrm{rms\_gn}} \le \|H\|_\infty$. By concentrating the power spectral density
of $w$ near a frequency $\omega_{\max}$ at which $|H(j\omega_{\max})| \approx \|H\|_\infty$, we have

$$\frac{\|Hw\|_{\mathrm{rms}}}{\|w\|_{\mathrm{rms}}} \approx \|H\|_\infty.$$

(Making this argument precise establishes (5.10).)

We can contrast the $H_\infty$ and $H_2$ norms by considering the associated inequality
specifications. Equation (5.11) implies that the $H_\infty$ norm-bound specification

$$\|H\|_\infty \le M \tag{5.12}$$

is equivalent to the specification:

$$\|Hw\|_{\mathrm{rms}} \le M \ \text{ for all } w \text{ with } \|w\|_{\mathrm{rms}} \le 1. \tag{5.13}$$

In contrast, the $H_2$ norm-bound specification

$$\|H\|_2 \le M \tag{5.14}$$

is equivalent to the specification

$$\|Hw\|_{\mathrm{rms}} \le M \ \text{ for } w \text{ a white noise.} \tag{5.15}$$

The $H_\infty$ norm is often combined with a frequency domain weight:

$$\|H\|_{W,\infty} = \|WH\|_\infty.$$

If the weighting transfer function $W$ and its inverse $W^{-1}$ are both stable, the
specification $\|H\|_{W,\infty} \le 1$ can be expressed in the more classical form: $H$ is stable
and

$$|H(j\omega)| \le |W(j\omega)|^{-1} \quad \text{for all } \omega,$$

depicted in figure 5.4(b).

Figure 5.4 An example of a frequency domain weight $W$ that enhances
frequencies above $\omega = 10$ is shown in (a). The specification $\|WH\|_\infty \le 1$
requires that the magnitude of $H(j\omega)$ lie below the curve $1/|W(j\omega)|$, as
shown in (b). In particular, $|H(j\omega)|$ must be below $-20$dB for $\omega > 12$.

5.2.7 Shifted $H_\infty$ Norm

A useful generalization of the $H_\infty$ norm of a transfer function is its $a$-exponentially
weighted $L_2$ norm gain, defined by

$$\|H\|_{\infty,a} = \sup_{\|\tilde w\|_2 \ne 0} \frac{\|\tilde z\|_2}{\|\tilde w\|_2},$$

where $\tilde w(t) = e^{at} w(t)$, $\tilde z(t) = e^{at} z(t)$, and $z = Hw$. It can be shown that

$$\|H\|_{\infty,a} = \sup_{\Re s > -a} |H(s)| = \|H_a\|_\infty, \tag{5.16}$$

where $H_a(s) = H(s-a)$. $H_a$ is called the $a$-shifted transfer function formed from
$H$, and the norm $\|\cdot\|_{\infty,a}$ is called the $a$-shifted $H_\infty$ norm of a transfer function.
This is shown in figure 5.5.

From (5.16), we see that the $a$-shifted $H_\infty$ norm of a transfer function is finite
if and only if the real parts of the poles of $H$ are less than $-a$. For $a < 0$, then,
the shifted $H_\infty$ norm can be used to measure the size of some unstable transfer
functions, for example,

$$\|1/(s-1)\|_{\infty,-2} = 1$$

(whereas $\|1/(s-1)\|_\infty = \infty$, meaning that its RMS gain is infinite). On the other
hand if $a > 0$, the $a$-shifted $H_\infty$ norm-bound specification

$$\|H\|_{\infty,a} \le M$$

(or even just $\|H\|_{\infty,a} < \infty$) guarantees that the poles of $H$ lie to the left of the line
$\Re s = -a$ in the complex plane.
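
The example $\|1/(s-1)\|_{\infty,-2} = 1$ can be checked with a frequency sweep along the shifted line. The sketch below assumes that $H$ is analytic and bounded in $\Re s > -a$, so that the supremum in (5.16) is attained on the boundary line $s = -a + j\omega$; the function name and grid are illustrative choices.

```python
def shifted_hinf_norm(H, a, w_max=1e3, n=20001):
    """Approximate sup over w of |H(-a + jw)| on a symmetric uniform frequency grid.

    This equals the a-shifted H-infinity norm only when H is analytic and
    bounded in Re s > -a (an assumption of this sketch).
    """
    best = 0.0
    for k in range(n):
        w = -w_max + 2 * w_max * k / (n - 1)
        best = max(best, abs(H(complex(-a, w))))
    return best

def H(s):
    """Unstable example from the text: pole at s = +1."""
    return 1 / (s - 1)

print(shifted_hinf_norm(H, a=-2.0))
```

With $a = -2$ the sweep runs along $s = 2 + j\omega$, where $|H(s)| = 1/|1 + j\omega|$ peaks at $\omega = 0$ with value 1. Choosing $a = -3$ instead moves the line to $s = 3 + j\omega$ and gives 0.5.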

We can interpret the shifted transfer function $H(s-a)$ as follows. Given a block
diagram for $H$ that consists of integrators (transfer function $1/s$), summing blocks,
and scaling amplifiers, we replace each integrator with a transfer function $1/(s-a)$
(called a "leaky integrator" when $a < 0$). The result is a block diagram of $H(s-a)$,
as shown in figure 5.6. In circuit theory, where $H$ is some network function, this is
called a uniform loading of $H$.

Figure 5.5 The magnitude of a transfer function $H$ is shown in (a). The
$L_2$ gain of the system is the peak magnitude of $H$ along the line $s = j\omega$.
The exponentially weighted ($a = -1$) $L_2$ gain of the system is the peak
magnitude of $H$ along the line $s = 1 + j\omega$, as shown in (b).

Figure 5.6 A realization of $H(s-a)$ can be formed by taking each integrator
in a realization of $H(s)$, shown in (a), and adding a feedback of $a$, as
shown in (b).



5.2.8 Hankel Norm 

The Hankel norm of a transfer function is a measure of the effect of its past input on
its future output, or the amount of energy that can be stored in and then retrieved
from the system. It is given by

$$\|H\|_{\mathrm{hankel}} = \sup\left\{ \left( \int_T^\infty z(t)^2\,dt \right)^{1/2} \;\middle|\; \int_0^T w(t)^2\,dt \le 1,\ w(t) = 0 \text{ for } t > T \ge 0 \right\}.$$

We can think of $w$ in this definition as an excitation that acts over the time period
$0 \le t \le T$; the response or ring of the system after the excitation has stopped is
$z(t)$ for $t > T$. An example of a past excitation and the resulting ring is shown in
figure 5.7.

It is useful to think of the map from the excitation ($w(t)$ for $0 \le t \le T$) to
ring ($z(t)$ for $t > T$) as consisting of two parts: first, the mapping of the excitation
into the state of the system at $t = T$; and then, the mapping from the state of the
system at $t = T$ (which "summarizes" the total effect on the future output that the
excitation can have) into the output for $t > T$. This interpretation will come up
again when we describe a method for computing $\|H\|_{\mathrm{hankel}}$.
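
The two-part view of the excitation-to-ring map can be illustrated on a first-order example. The sketch below assumes $H(s) = 1/(s+1)$ (a hypothetical choice), for which the state at time $T$ is $x(T) = \int_0^T e^{-(T-t)} w(t)\,dt$ and the ring is $z(t) = x(T)e^{-(t-T)}$. It only searches over exponential excitations $w(t) = e^{c(t-T)}$, so it is an illustration of the definition rather than a general method for computing the Hankel norm.

```python
import math

def ring_gain(c, T=20.0, dt=1e-3):
    """Square-root ring energy of H(s) = 1/(s+1) per unit square-root excitation energy.

    The excitation is w(t) = exp(c*(t - T)) on [0, T] and zero afterwards.
    """
    n = int(T / dt)
    energy_in = 0.0
    state = 0.0  # x(T) = integral_0^T exp(-(T - t)) w(t) dt
    for k in range(n):
        t = (k + 0.5) * dt
        w = math.exp(c * (t - T))
        energy_in += w * w * dt
        state += math.exp(-(T - t)) * w * dt
    # After T the input is zero, so z(t) = x(T) exp(-(t - T)), with energy x(T)^2 / 2.
    energy_out = state ** 2 / 2
    return math.sqrt(energy_out / energy_in)

for c in (0.25, 0.5, 1.0, 2.0, 4.0):
    print(c, ring_gain(c))
```

For large $T$ the ratio is approximately $\sqrt{c}/(1+c)$, which is largest at $c = 1$ with value 1/2. This matches the standard value $1/(2p)$ of the Hankel norm of $1/(s+p)$ for $p = 1$, with the maximizing excitation being the time-reversed impulse response.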



5.2.9 Example 1: Comparing Two Transfer Functions 

In this section we will consider various norms of the two transfer functions

$$H_{13}^{(a)}(s) = \frac{-44.1s^3 + 334s^2 + 1034s + 390}{s^6 + 20s^5 + 155s^4 + 586s^3 + 1115s^2 + 1034s + 390}, \tag{5.17}$$

$$H_{13}^{(b)}(s) = \frac{-220s^3 + 222s^2 + 19015s + 7245}{s^6 + 29.1s^5 + 297s^4 + 1805s^3 + 9882s^2 + 19015s + 7245}. \tag{5.18}$$

These transfer functions are the I/O transfer functions ($T$) achieved by the controllers
$K^{(a)}$ and $K^{(b)}$ in the standard plant example of section 2.4. The step
responses of $H_{13}^{(a)}$ and $H_{13}^{(b)}$ are shown in figure 5.8 and their frequency response
magnitudes in figure 5.9. The values of various norms of $H_{13}^{(a)}$ and $H_{13}^{(b)}$ are shown
in table 5.1.

Figure 5.7 The Hankel norm of a transfer function is the largest possible
square root energy in the output $z$ for $t > T$, given a unit-energy excitation
$w$ that stops at $t = T$.

From the first row of table 5.1 we see that the peak of the response of $H_{13}^{(a)}$ to
a step input is about the same as $H_{13}^{(b)}$. Thus, in the sense of peak step response,
$H_{13}^{(a)}$ is about the same size as $H_{13}^{(b)}$.

If $H_{13}^{(a)}$ and $H_{13}^{(b)}$ are driven by white noise, the RMS value of the output of $H_{13}^{(a)}$
is less than half that of $H_{13}^{(b)}$ (second row). Figure 5.10 shows an example of the
outputs of $H_{13}^{(a)}$ and $H_{13}^{(b)}$ with a white noise excitation. Thus, in the sense of RMS
response to white noise, we can say that $H_{13}^{(a)}$ is about half the size of $H_{13}^{(b)}$.

    Norm                                   $H_{13}^{(a)}$   $H_{13}^{(b)}$   figure
    $\|\cdot\|_{\mathrm{pk\_step}}$        1.36             1.40             5.8
    $\|\cdot\|_2$                          1.17             2.69             5.10
    $\|\cdot\|_{\mathrm{wc}}$              1.60             1.68             5.11
    $\|\cdot\|_{\mathrm{pk\_gn}}$          1.74             4.93             5.12
    $\|\cdot\|_\infty$                     1.47             3.72             5.9
    $\|\cdot\|_{\mathrm{hankel}}$          1.07             2.04             5.13

Table 5.1 The values of six different norms of $H_{13}^{(a)}$ and $H_{13}^{(b)}$.

Figure 5.8 The step responses of the transfer functions in (5.17) and (5.18).
Note that $\|H_{13}^{(a)}\|_{\mathrm{pk\_step}} = 1.36$, and $\|H_{13}^{(b)}\|_{\mathrm{pk\_step}} = 1.40$.

Figure 5.9 The magnitudes of the transfer functions in (5.17) and (5.18).
Note that $\|H_{13}^{(a)}\|_\infty = 1.47$, and $\|H_{13}^{(b)}\|_\infty = 3.72$.

Figure 5.10 (a) shows a sample of the steady-state response of $H_{13}^{(a)}$ to a
white noise excitation, together with the value $\|H_{13}^{(a)}\|_2 = 1.17$. (b) shows
a sample of the steady-state response of $H_{13}^{(b)}$ to a white noise excitation,
together with the value $\|H_{13}^{(b)}\|_2 = 2.69$.



From the third row, we see that the worst case response of $H_{13}^{(a)}$ to inputs
bounded and slew-rate limited by 1 is similar to that of $H_{13}^{(b)}$. Amplitude and slew-rate
limited input waveforms that produce outputs with peak values close to these
worst case values are shown in figure 5.11. Thus, in the sense of maximum peak
output in response to inputs bounded and slew limited by 1, $H_{13}^{(a)}$ is about the same
size as $H_{13}^{(b)}$.

From the fourth row, we see that the peak output of $H_{13}^{(b)}$ with a worst case input
bounded by 1 is almost three times larger than $H_{13}^{(a)}$. This is expected from the
step response total variation expression for the peak gain (see figures 5.3 and 5.8).
Input waveforms that produce outputs close to these worst case values are shown
in figure 5.12. Thus, in the sense of maximum peak output in response to inputs
bounded by 1, $H_{13}^{(a)}$ is about one third the size of $H_{13}^{(b)}$.

From the fifth row, we see that the RMS gain of $H_{13}^{(b)}$ is more than twice as
large as the RMS gain of $H_{13}^{(a)}$. This can be seen from figure 5.9; input signals that
result in the largest possible ratio of RMS response to RMS input are sinusoids at
frequencies $\omega = 0.9$ (for $H_{13}^{(a)}$) and $\omega = 6.3$ (for $H_{13}^{(b)}$).
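
The fifth row of table 5.1 can be approximately reproduced by sweeping $|H(j\omega)|$ over a frequency grid, using the coefficients printed in (5.17) and (5.18). The grid, the helper names, and the use of a plain sweep (rather than whatever method produced the table) are assumptions of this sketch.

```python
def polyval(coeffs, s):
    """Evaluate a polynomial with coefficients in descending powers (Horner's rule)."""
    acc = 0j
    for c in coeffs:
        acc = acc * s + c
    return acc

# Coefficients copied from (5.17) and (5.18), highest power first.
NUM_A = [-44.1, 334, 1034, 390]
DEN_A = [1, 20, 155, 586, 1115, 1034, 390]
NUM_B = [-220, 222, 19015, 7245]
DEN_B = [1, 29.1, 297, 1805, 9882, 19015, 7245]

def peak_magnitude(num, den, w_min=1e-2, w_max=1e2, n=4001):
    """Largest |H(jw)| on a logarithmic frequency grid, and the frequency where it occurs."""
    best_mag, best_w = 0.0, w_min
    for k in range(n):
        w = w_min * (w_max / w_min) ** (k / (n - 1))
        mag = abs(polyval(num, 1j * w) / polyval(den, 1j * w))
        if mag > best_mag:
            best_mag, best_w = mag, w
    return best_mag, best_w

print(peak_magnitude(NUM_A, DEN_A))
print(peak_magnitude(NUM_B, DEN_B))
```

The sweep should find peaks near $\omega = 0.9$ and $\omega = 6.3$ with magnitudes close to the tabulated 1.47 and 3.72. A small discrepancy for $H_{13}^{(b)}$ is plausible, since the printed coefficients are rounded and the resonance is fairly sharp.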

Finally, the worst case square root energy in the output of $H_{13}^{(a)}$ after its unit-energy
input signal is turned off is about 48% lower than the worst case for $H_{13}^{(b)}$
(sixth row of table 5.1). Thus, in the sense of square root energy gain from past
inputs to future outputs we can say that $H_{13}^{(a)}$ is about half the size of $H_{13}^{(b)}$.
Figure 5.13 shows unit-energy excitations for $t < 5$ that produce square root output
energies for $t > 5$ close to the Hankel norms of $H_{13}^{(a)}$ and $H_{13}^{(b)}$.

Figure 5.11 (a) shows an input signal $w$ with $\|w\|_\infty = 1$ and $\|\dot w\|_\infty = 1$
that drives the output of $H_{13}^{(a)}$ close to $\|H_{13}^{(a)}\|_{\mathrm{wc}} = 1.60$. (b) shows an input
signal $w$ with $\|w\|_\infty = 1$ and $\|\dot w\|_\infty = 1$ that drives the output of $H_{13}^{(b)}$ close
to $\|H_{13}^{(b)}\|_{\mathrm{wc}} = 1.68$.

Figure 5.12 (a) shows an input signal $w$ with $\|w\|_\infty = 1$, together with
the output $z$ produced when $H_{13}^{(a)}$ is driven by $w$; $z(10) = 1.74$ is very close
to $\|H_{13}^{(a)}\|_{\mathrm{pk\_gn}}$. (b) shows an input signal $w$ with $\|w\|_\infty = 1$, together with
the output $z$ produced when $H_{13}^{(b)}$ is driven by $w$; $z(10) = 4.86$ is close to
$\|H_{13}^{(b)}\|_{\mathrm{pk\_gn}} = 4.93$.

Figure 5.13 (a) shows a unit-energy input signal $w$ that is zero for $t > 5$,
together with the output $z$ when $H_{13}^{(a)}$ is driven by $w$. The square root
energy in $z$ for $t > 5$ is close to $\|H_{13}^{(a)}\|_{\mathrm{hankel}} = 1.07$. (b) shows a unit-energy
input signal $w$ that is zero for $t > 5$, together with the output $z$
when $H_{13}^{(b)}$ is driven by $w$. The square root energy in $z$ for $t > 5$ is close to
$\|H_{13}^{(b)}\|_{\mathrm{hankel}} = 2.04$.

5.2.10 Example 2: the Gain of an Amplifier Circuit

Consider the band pass filter circuit shown in figure 5.14. We will assume that the
opamp saturates at ±14V. The input of the circuit (which is produced by another
opamp) is no larger than ±14V (i.e., $\|w\|_\infty \le 14$). We ask the question: can this
filter saturate?

Assuming the opamp does not saturate, the transfer function from $w$ to $z$ is

$$H(s) = \frac{-2s/10^4}{(s/10^4 + 1)^2}. \tag{5.19}$$

The maximum magnitude of this transfer function is 1.0 ($\|H\|_\infty = 1$), so, provided
the opamp does not saturate, the RMS value of the filter output does not exceed
the RMS value of the filter input. It is tempting to conclude that the opamp in the
filter will not saturate.

This conclusion is wrong, however. The peak gain of the transfer function $H$
is $\|H\|_{\mathrm{pk\_gn}} = 1.47$, so there are inputs bounded by, say, ±10V that will drive the
opamp into saturation. Figure 5.15 gives an example of such an input signal, and
the corresponding output that would be produced if the opamp did not saturate.
Since it exceeds 14V, the real filter will saturate with this input signal.

Figure 5.14 A bandpass filter circuit. The amplifier has very large open-loop
gain. The output clips at ±14V, and the input lies between ±14V.
When the circuit is operating linearly the transfer function from $w$ to $z$ is
given by (5.19).

Figure 5.15 If the circuit shown in figure 5.14 did not saturate, the input
$w$ shown would produce the output $z$. Even though $\|w\|_\infty = 10$V, we have
$\|z\|_\infty = 14.7$V. Thus the input $w$ will drive the real circuit in figure 5.14
into saturation. Of course, $\|z\|_{\mathrm{rms}} \le \|w\|_{\mathrm{rms}}$ since $\|H\|_\infty = 1$.
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5.3 Norms of MIMO LTI Systems 

Some of the common norms for multiple-input, multiple-output (MIMO) LTI systems
can be expressed in terms of the singular values of the $n_z \times n_w$ transfer matrix
$H$, which, roughly speaking, give information analogous to the magnitude of a SISO
transfer function. The singular values of a matrix $M \in \mathbf{C}^{n_z \times n_w}$ are defined by

$$\sigma_i(M) = \left(\lambda_i(M^* M)\right)^{1/2}, \qquad i = 1, \ldots, \min\{n_z, n_w\}, \tag{5.20}$$

where $\lambda_i(\cdot)$ denotes the $i$th largest eigenvalue. The largest singular value (i.e., $\sigma_1$)
is also denoted $\sigma_{\max}$. A plot of $\sigma_i(H(j\omega))$ is called a singular value plot, and is
analogous to a Bode magnitude plot of a SISO transfer function (an example is
given in figure 5.17).

5.3.1 RMS Response to a Particular Noise Input 

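Suppose $H$ is stable (i.e., each of its entries is stable), and $S_w$ is the power spectral
density matrix of $w$. Then

$$\|H\|_{\mathrm{rms},w} = \left( \mathrm{Tr}\, \frac{1}{2\pi} \int_{-\infty}^{\infty} H(j\omega) S_w(j\omega) H(j\omega)^* \, d\omega \right)^{1/2}.$$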

5.3.2 H 2 Norm: RMS Response to White Noise 

If $S_w(\omega) \approx I$ for those frequencies for which $H(j\omega)$ is significant, then this norm is
approximately the $H_2$ norm of a MIMO system:

$$\|H\|_2 = \left( \mathrm{Tr}\, \frac{1}{2\pi} \int_{-\infty}^{\infty} H(j\omega) H(j\omega)^* \, d\omega \right)^{1/2}. \tag{5.21}$$

The $H_2$ norm of $H$ is therefore the RMS value of the output when the inputs are
driven by independent white noises.

By Parseval's theorem, the $H_2$ norm can be expressed as

$$\|H\|_2 = \left( \mathrm{Tr} \int_0^\infty h(t) h(t)^T \, dt \right)^{1/2} = \left( \sum_{i=1}^{n_z} \sum_{k=1}^{n_w} \|H_{ik}\|_2^2 \right)^{1/2},$$

where $h$ is the impulse matrix of $H$. Thus, the $H_2$ norm of a transfer matrix $H$ is
the square root of the sum of the squares of the $H_2$ norms of its entries.

The $H_2$ norm can also be expressed in terms of the singular values of the transfer
matrix $H$:

$$\|H\|_2 = \left( \frac{1}{2\pi} \int_{-\infty}^{\infty} \sum_{i=1}^{n} \sigma_i(H(j\omega))^2 \, d\omega \right)^{1/2}, \tag{5.22}$$

where $n = \min\{n_z, n_w\}$. Thus, the square of the $H_2$ norm is the total area under
the squared singular value plots, on a linear frequency scale. This is shown in
figure 5.16.




Figure 5.16 For a MIMO transfer matrix, $\|H\|_2^2$ is proportional to the area
under a plot of the sum of the squares of the singular values of $H$ (shown
here for $\min\{n_z, n_w\} = 3$).



5.3.3 Peak Gain 

The peak gain of a MIMO system is

$$\|H\|_{\mathrm{pk\_gn}} = \sup_{\|w\|_\infty \neq 0} \frac{\|Hw\|_\infty}{\|w\|_\infty} = \max_{1 \le i \le n_z} \int_0^\infty \sum_{j=1}^{n_w} |h_{ij}(t)| \, dt. \tag{5.23}$$

For MIMO systems, the peak gain is not the same as the average-absolute gain
(c.f. (5.7)). The average-absolute gain is

$$\|H\|_{\mathrm{aa\_gn}} = \sup_{\|w\|_{\mathrm{aa}} \neq 0} \frac{\|Hw\|_{\mathrm{aa}}}{\|w\|_{\mathrm{aa}}} = \|H^T\|_{\mathrm{pk\_gn}},$$

the peak gain of the LTI system whose transfer matrix is the transpose of $H$.

5.3.4 RMS Gain

The RMS gain of a MIMO system is important for several reasons, one being that it
is readily computed from state-space equations. The RMS gain of a MIMO transfer
matrix is

$$\|H\|_{\mathrm{rms\_gn}} = \|H\|_\infty = \sup_{\Re s > 0} \sigma_{\max}(H(s)),$$

the $H_\infty$ norm of a transfer matrix (c.f. (5.11), the analogous definition for transfer
functions). Thus, $\|H\|_\infty < \infty$ if and only if the transfer matrix $H$ is stable. For
stable $H$, we can express the $H_\infty$ norm as the maximum of the maximum singular
value over all frequencies:

$$\|H\|_\infty = \sup_{\omega} \sigma_{\max}(H(j\omega)),$$

as shown in figure 5.17. Note that the other singular values do not affect $\|H\|_\infty$.




Figure 5.17 The $H_\infty$ norm of a stable transfer matrix $H$ is the maximum
over frequency $\omega$ of the maximum singular value, $\sigma_1$. The other singular
values, $\sigma_2, \ldots, \sigma_n$, do not affect the $H_\infty$ norm.



5.3.5 Entropy of a System 

In this section we describe a measure of the size of a MIMO system, which is not
a norm, but is closely related to the $H_2$ norm and the $H_\infty$ norm. For $\gamma > 0$ we
define the $\gamma$-entropy of the system with transfer matrix $H$ as

$$I_\gamma(H) = \begin{cases} \dfrac{-\gamma^2}{2\pi} \displaystyle\int_{-\infty}^{\infty} \log\det\left(I - \gamma^{-2} H(j\omega) H(j\omega)^*\right) d\omega & \text{if } \|H\|_\infty < \gamma \\ \infty & \text{if } \|H\|_\infty \ge \gamma \end{cases}$$

(c.f. (5.21)). The $\gamma$-entropy can also be expressed in terms of the singular values as

$$I_\gamma(H) = \begin{cases} \dfrac{1}{2\pi} \displaystyle\int_{-\infty}^{\infty} \sum_{i=1}^{n} -\gamma^2 \log\left(1 - \left(\sigma_i(H(j\omega))/\gamma\right)^2\right) d\omega & \text{if } \|H\|_\infty < \gamma \\ \infty & \text{if } \|H\|_\infty \ge \gamma \end{cases}$$

(c.f. (5.22)). This last formula allows us to interpret the $\gamma$-entropy of $H$ as a measure
of its size, that puts a weight $-\gamma^2 \log(1 - (\sigma/\gamma)^2)$ on a singular value $\sigma$, whereas
the $H_2$ norm uses the weight $\sigma^2$. This weight function is shown in figure 5.18, with
$\sigma^2$ shown for comparison.



Figure 5.18 The $\gamma$-entropy of $H$ is a measure of its size that puts a weight
$-\gamma^2 \log(1 - (\sigma/\gamma)^2)$ on a singular value $\sigma$, whereas the $H_2$ norm uses the
weight $\sigma^2$.

Since these two weight functions are close when $\sigma$ is small compared to $\gamma$, we
see that

$$\lim_{\gamma \to \infty} \sqrt{I_\gamma(H)} = \|H\|_2. \tag{5.24}$$

From figure 5.18 we can see that

$$\sqrt{I_\gamma(H)} \ge \|H\|_2, \tag{5.25}$$

i.e., the square root of the $\gamma$-entropy of a transfer matrix is no smaller than its $H_2$
norm. We also have the more complicated converse inequality

$$\sqrt{I_\gamma(H)} \le \sqrt{\frac{-\log(1 - \alpha^2)}{\alpha^2}} \; \|H\|_2, \tag{5.26}$$

where $\alpha = \|H\|_\infty / \gamma < 1$. Thus, the relative increase in the square root of the
$\gamma$-entropy over the $H_2$ norm can be bounded by an expression that only depends
on how close the $H_\infty$ norm is to the critical value $\gamma$. For example, if $\|H\|_\infty \le \gamma/2$
($\alpha \le 0.5$), we have $\|H\|_2 \le \sqrt{I_\gamma(H)} \le 1.073 \|H\|_2$, i.e., if $\gamma$ exceeds $\|H\|_\infty$ by 6dB
or more, then the square root of the $\gamma$-entropy and $\|H\|_2$ cannot differ by more
than about 0.6dB. Thus, the $H_\infty$ norm of a transfer matrix must be near $\gamma$ for the
square root of the $\gamma$-entropy to differ much from the $H_2$ norm.
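
The numbers in this example follow directly from the factor in (5.26). As a small check (the function name is an illustrative choice), the arithmetic can be reproduced as follows.

```python
import math

def entropy_bound_factor(alpha):
    """Factor in (5.26): sqrt(-log(1 - alpha^2) / alpha^2), for 0 < alpha < 1."""
    return math.sqrt(-math.log(1.0 - alpha ** 2) / alpha ** 2)

factor = entropy_bound_factor(0.5)       # alpha = ||H||_inf / gamma = 1/2
factor_db = 20.0 * math.log10(factor)    # the same factor expressed in dB
margin_db = 20.0 * math.log10(2.0)       # gamma exceeding ||H||_inf by a factor of 2
```

Here `factor` is about 1.073 and `factor_db` about 0.6, while `margin_db` is about 6, matching the figures quoted in the text.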

An important property of the entropy is that it is readily computed using state- 
space methods, as we will see in section 5.6.5. 

An Interpretation of the Entropy 

Recall that the square of the $H_2$ norm of a transfer function $H$ is the power of
its output when it is driven by a white noise. For the case of scalar (i.e., transfer
function) $H$, we can give a similar interpretation of the $\gamma$-entropy of $H$ as the average
power of its output when it is driven by a white noise, and a certain random feedback
is connected around it, as shown in figure 5.19.



Figure 5.19 The $\gamma$-entropy of a transfer function $H$ is the average power of
its output $z$ when it is driven by a white noise $w$ and a random feedback $\Delta$
is connected around it. The values of the random feedback transfer function
$\Delta$ are independent at different frequencies, and uniformly distributed on a
disk of radius $1/\gamma$ centered at the origin in $\mathbf{C}$.



The transfer function from $w$ to $z$ in figure 5.19, with a particular feedback
transfer function $\Delta$ connected, is $H/(1 - \Delta H)$, so the power of the output signal is

$$\left\| \frac{H}{1 - \Delta H} \right\|_2^2.$$

We now assume that $\Delta$ is random, with $\Delta(j\omega)$ and $\Delta(j\nu)$ independent for $\omega \neq \nu$,
and each $\Delta(j\omega)$ uniformly distributed on the disk of radius $1/\gamma$ in the complex
plane. Then we have

$$\mathbf{E}_\Delta \left\| \frac{H}{1 - \Delta H} \right\|_2^2 = I_\gamma(H), \tag{5.27}$$

where $\mathbf{E}_\Delta$ denotes expectation over the random feedback transfer functions $\Delta$.

Some feedback transfer functions decrease the power in the output, while other
feedback transfer functions increase it; the inequality (5.25) shows that on average,
the power in the output is increased by the feedback. The limit (5.24) shows that
if the feedback is small, then it has little effect on the output power (indeed, (5.26)
shows that the average effect of the feedback on the output power is small unless
the $H_\infty$ norm of the feedback is close to the inverse of the $H_\infty$ norm of $H$).
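
The pointwise identity behind (5.27) can be checked numerically at a single frequency. The sketch below is only an illustration: it fixes a hypothetical complex value $h = H(j\omega)$ and a hypothetical $\gamma$, averages $|h/(1 - \Delta h)|^2$ over the disk $|\Delta| \le 1/\gamma$ with a midpoint rule in polar coordinates, and compares the result with the weight $-\gamma^2 \log(1 - |h|^2/\gamma^2)$.

```python
import cmath
import math

def average_power(h, gamma, n_r=400, n_theta=256):
    """Mean of |h / (1 - delta*h)|^2 for delta uniform on the disk |delta| <= 1/gamma.

    Midpoint rule in polar coordinates; requires |h| < gamma."""
    radius = 1.0 / gamma
    dr = radius / n_r
    dtheta = 2.0 * math.pi / n_theta
    total = 0.0
    for i in range(n_r):
        r = (i + 0.5) * dr
        for k in range(n_theta):
            delta = r * cmath.exp(1j * (k + 0.5) * dtheta)
            total += abs(h / (1.0 - delta * h)) ** 2 * r   # r is the polar area weight
    area = math.pi * radius ** 2
    return total * dr * dtheta / area

def entropy_weight(h, gamma):
    """Closed form -gamma^2 log(1 - |h|^2 / gamma^2)."""
    return -gamma ** 2 * math.log(1.0 - abs(h) ** 2 / gamma ** 2)

h, gamma = 0.8 + 0.3j, 2.0     # hypothetical values with |h| < gamma
numeric = average_power(h, gamma)
closed_form = entropy_weight(h, gamma)
```

For these values the two quantities agree to several digits, and both exceed $|h|^2$, which is the pointwise version of the inequality (5.25).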

5.3.6 Scaling and Weights 

The discussion of section 4.3.5 concerning scalings and weights for norms of vector
signals has important implications for norms of MIMO systems; gains especially are
affected by the scaling used to measure the input and output vector signals. For
example, let us consider the effect of scaling on the RMS gain of a (square) MIMO
system. Let $D$ be a scaling matrix (i.e., diagonal). The $D$-scaled RMS value of $Hw$
is $\|Hw\|_{D,\mathrm{rms}} = \|DHw\|_{\mathrm{rms}}$; the $D$-scaled RMS value of $w$ is $\|w\|_{D,\mathrm{rms}} = \|Dw\|_{\mathrm{rms}}$.
Thus, the $D$-scaled RMS gain of $H$ is

$$\sup_{\|w\|_{\mathrm{rms}} \neq 0} \frac{\|DHw\|_{\mathrm{rms}}}{\|Dw\|_{\mathrm{rms}}} = \sup_{\|\tilde{w}\|_{\mathrm{rms}} \neq 0} \frac{\|DHD^{-1}\tilde{w}\|_{\mathrm{rms}}}{\|\tilde{w}\|_{\mathrm{rms}}} = \|DHD^{-1}\|_\infty,$$

the $H_\infty$ norm of the diagonally pre- and post-scaled transfer matrix.

More general transfer matrix weights can be applied, e.g., $\|W_{\mathrm{post}} H W_{\mathrm{pre}}\|$.

5.4 Important Properties of Gains 

5.4.1 Gain of a Cascade Connection 

An important property of gains is that the gain of the product of two transfer
matrices can be bounded in terms of the gains of the individual transfer matrices:
if $\|\cdot\|_{\mathrm{gn}}$ denotes any gain, then

$$\|H_2 H_1\|_{\mathrm{gn}} \le \|H_2\|_{\mathrm{gn}} \|H_1\|_{\mathrm{gn}}. \tag{5.28}$$

This inequality is easily established; it can be seen from figure 5.20. The property
(5.28) does not generally hold for norms of LTI systems that are not gains. For
example,

$$H_1 = \frac{3s}{s^2 + s + 3}, \qquad H_2 = \frac{3s}{s^2 + 0.5s + 2}$$

have $\|H_1\|_{\mathrm{pk\_step}} = 1.18$ and $\|H_2\|_{\mathrm{pk\_step}} = 1.65$, so that $\|H_1\|_{\mathrm{pk\_step}} \|H_2\|_{\mathrm{pk\_step}} = 1.95$,
but $\|H_2 H_1\|_{\mathrm{pk\_step}} = 2.80 > 1.95$.

On the other hand, $\|H_1\|_\infty = 3$, $\|H_2\|_\infty = 6$, and $\|H_2 H_1\|_\infty = 15.3 < 18$, since
the $H_\infty$ norm is the RMS gain.
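
Several of the numbers in this example can be reproduced with a frequency sweep and a closed-form step response. The sketch below assumes the two transfer functions are $3s/(s^2+s+3)$ and $3s/(s^2+0.5s+2)$; the grid sizes and function names are illustrative choices, and the peak-step norm of the product is not recomputed here.

```python
import math

def h1(s):
    return 3 * s / (s * s + s + 3)

def h2(s):
    return 3 * s / (s * s + 0.5 * s + 2)

def hinf_norm(h, w_max=20.0, points=100_001):
    """Estimate sup over w of |h(jw)| on a uniform grid (stable SISO h)."""
    step = w_max / (points - 1)
    return max(abs(h(1j * k * step)) for k in range(points))

def peak_step(b, a1, a0, t_max=40.0, points=100_001):
    """Peak of |step response| of b*s/(s^2 + a1*s + a0), underdamped case.

    The step response equals the impulse response of b/(s^2 + a1*s + a0):
    (b/wd) exp(-a1 t/2) sin(wd t), with wd = sqrt(a0 - a1^2/4)."""
    wd = math.sqrt(a0 - a1 * a1 / 4.0)
    step = t_max / (points - 1)
    return max(
        abs(b / wd * math.exp(-a1 * k * step / 2.0) * math.sin(wd * k * step))
        for k in range(points)
    )

norm1 = hinf_norm(h1)
norm2 = hinf_norm(h2)
norm21 = hinf_norm(lambda s: h2(s) * h1(s))
pk1 = peak_step(3.0, 1.0, 3.0)
pk2 = peak_step(3.0, 0.5, 2.0)
```

With these definitions `norm21` stays below `norm1 * norm2`, as (5.28) requires for the RMS gain, while the product `pk1 * pk2` is about 1.95.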



Figure 5.20 The gain of two cascaded transfer matrices is no larger
than the product of the gains of the transfer matrices, i.e.,
$\|H_2 H_1\|_{\mathrm{gn}} \le \|H_2\|_{\mathrm{gn}} \|H_1\|_{\mathrm{gn}}$.



5.4.2 Gain of a Feedback Connection

Figure 5.21 Two systems connected in a feedback loop.

Consider the feedback connection shown in figure 5.21. Assuming that this feedback
connection is well-posed, meaning that $\det(I - H_1 H_2)$ is not identically zero, the
transfer matrix from $w$ to $z$ is

$$G = (I - H_1 H_2)^{-1} H_1 = H_1 (I - H_2 H_1)^{-1}. \tag{5.29}$$

A fact that we will need in chapter 10 is that, provided the product of the gains
of the two transfer matrices is less than one, the gain of $G$ can be bounded. More
precisely, if

$$\|H_1\|_{\mathrm{gn}} \|H_2\|_{\mathrm{gn}} < 1 \tag{5.30}$$

holds, then the feedback connection is well-posed and we have

$$\|G\|_{\mathrm{gn}} \le \frac{\|H_1\|_{\mathrm{gn}}}{1 - \|H_1\|_{\mathrm{gn}} \|H_2\|_{\mathrm{gn}}} \tag{5.31}$$

(note the similarity to (5.29)).

The condition (5.30) is called the "small gain" condition, since, roughly speaking,
it limits the gain around the loop in figure 5.21 to less than one. For this reason,
the result above is sometimes called the small gain theorem.

The small gain theorem can be used to establish stability of a feedback connection,
if the gain $\|\cdot\|_{\mathrm{gn}}$ is such that $\|H\|_{\mathrm{gn}} < \infty$ implies that $H$ is stable. The RMS
gain, for example, has this property: the small gain condition $\|H_1\|_\infty \|H_2\|_\infty < 1$
implies that the transfer matrix $G$ is stable, that is, all of its poles have negative
real part. Similarly, if the gain $\|\cdot\|_{\mathrm{gn}}$ is the $a$-shifted $H_\infty$ norm, then the small
gain condition $\|H_1\|_{\infty,a} \|H_2\|_{\infty,a} < 1$ implies that the poles of $G$ have real parts less
than $-a$.

The small gain theorem is easily shown. Suppose that the small gain condition
(5.30) holds. The feedback connection of figure 5.21 means

$$z = H_1 (w + H_2 z) = H_1 w + H_1 H_2 z.$$

Using the triangle inequality,

$$\|z\| \le \|H_1 w\| + \|H_1 H_2 z\|,$$

where $\|\cdot\|$ is the norm used for all signals. Using the definition of gain and the
property (5.28), we have

$$\|z\| \le \|H_1\|_{\mathrm{gn}} \|w\| + \|H_1 H_2\|_{\mathrm{gn}} \|z\| \le \|H_1\|_{\mathrm{gn}} \|w\| + \|H_1\|_{\mathrm{gn}} \|H_2\|_{\mathrm{gn}} \|z\|,$$

so that

$$\|z\| \left(1 - \|H_1\|_{\mathrm{gn}} \|H_2\|_{\mathrm{gn}}\right) \le \|H_1\|_{\mathrm{gn}} \|w\|.$$

Using the small gain condition, we have

$$\|z\| \le \frac{\|H_1\|_{\mathrm{gn}}}{1 - \|H_1\|_{\mathrm{gn}} \|H_2\|_{\mathrm{gn}}} \|w\|. \tag{5.32}$$

Since (5.32) holds for all signals $w$, (5.31) follows.
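
The bound (5.31) can be illustrated on a scalar example using the RMS gain. The sketch below assumes two hypothetical first-order transfer functions and estimates each gain by a frequency sweep; it is an illustration of the inequality, not a proof.

```python
def gain(h, w_max=100.0, points=100_001):
    """Estimate the RMS gain sup over w of |h(jw)| of a stable SISO transfer function."""
    step = w_max / (points - 1)
    return max(abs(h(1j * k * step)) for k in range(points))

def h1(s):
    return 0.5 / (s + 1.0)          # hypothetical H1, RMS gain 0.5

def h2(s):
    return 1.2 / (0.5 * s + 1.0)    # hypothetical H2, RMS gain 1.2

def closed_loop(s):
    """G = H1 / (1 - H1 H2), the SISO case of (5.29)."""
    return h1(s) / (1.0 - h1(s) * h2(s))

g1, g2 = gain(h1), gain(h2)
loop_gain = g1 * g2                 # must be below one for (5.30)
bound = g1 / (1.0 - loop_gain)      # right-hand side of (5.31)
g_closed = gain(closed_loop)
```

In this example the loop gain is 0.6 and the bound 1.25 is attained at zero frequency, so the small gain bound cannot be improved in general.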

5.5 Comparing Norms 

The intuition that different norms for LTI systems should generally agree about
which transfer matrices are "small" or "large" is false. There are, however, some
general inequalities that the norms we have seen must satisfy.

5.5.1 Some General Inequalities

For convenience we will consider norms of SISO systems, i.e., norms of transfer
functions.

Since $\|\cdot\|_{\mathrm{pk\_step}}$ and $\|\cdot\|_\infty$ are each worst case peak norms over input signal
sets that have peaks no greater than one (a unit step in the first case, and unit
amplitude sinusoids in the second), it follows that these norms will be no larger
than the peak gain of the system,

$$\|H\|_{\mathrm{pk\_gn}} \ge \|H\|_\infty, \qquad \|H\|_{\mathrm{pk\_gn}} \ge \|H\|_{\mathrm{pk\_step}}.$$

From the definition of the Hankel norm, we can see that it cannot exceed the $L_2$
gain, which we saw is the $H_\infty$ norm, so we have

$$\|H\|_{\mathrm{hankel}} \le \|H\|_\infty.$$

It is possible for a system to have a small RMS gain, but a large peak gain.
However, if $H$ has $n$ poles then the peak gain of $H$ can be bounded in terms of the
Hankel norm, and therefore, the RMS gain:

$$\|H\|_\infty \le \|H\|_{\mathrm{pk\_gn}} \le (2n + 1) \|H\|_{\mathrm{hankel}} \le (2n + 1) \|H\|_\infty. \tag{5.33}$$

This means that for low order systems, at least, the peak gain, RMS gain, and
Hankel norm cannot differ too much.
5.5.2 Approximating Norms: an Example 

Consider the worst case norm described in section 5.1.3, with the amplitude bound
and slew-rate limit each equal to one:

$$\|H\|_{\mathrm{wc}} = \sup \left\{ \|Hw\|_\infty \;\middle|\; \|w\|_\infty \le 1, \ \|\dot{w}\|_\infty \le 1 \right\}.$$

Roughly speaking, the bound and slew-rate limit establish a bandwidth limit of
about one for the input signal $w$. We might therefore suspect that we can approximate
$\|H\|_{\mathrm{wc}}$ by a weighted peak gain, where the weight is some appropriate lowpass
filter with a bandwidth near one:

$$\|H\|_{\mathrm{wc}} \approx \|HW\|_{\mathrm{pk\_gn}}.$$

We will show that this intuition is correct: for $W(s) = 1/(2s + 1)$, we have

$$\|HW\|_{\mathrm{pk\_gn}} \le \|H\|_{\mathrm{wc}} \le 3 \|HW\|_{\mathrm{pk\_gn}} \tag{5.34}$$

for all transfer functions $H$. Thus, $\sqrt{3}\, \|HW\|_{\mathrm{pk\_gn}}$ approximates $\|H\|_{\mathrm{wc}}$ to within
±4.8dB.

To establish (5.34), suppose that $w_0$ is a signal with

$$\|w_0\|_\infty = 1, \qquad \|HWw_0\|_\infty = \|HW\|_{\mathrm{pk\_gn}}$$

(such a $w_0$ can be shown to exist). Let $w_1 = Ww_0$. Then we have

$$\|w_1\|_\infty \le \|W\|_{\mathrm{pk\_gn}} = 1,$$

and

$$\|\dot{w}_1\|_\infty = \left\| \frac{s}{2s + 1} w_0 \right\|_\infty \le \left\| \frac{s}{2s + 1} \right\|_{\mathrm{pk\_gn}} = 1.$$

Therefore $w_1$ satisfies the amplitude limit $\|w_1\|_\infty \le 1$ and slew-rate limit $\|\dot{w}_1\|_\infty \le 1$,
so we must have $\|Hw_1\|_\infty \le \|H\|_{\mathrm{wc}}$. Since $\|Hw_1\|_\infty = \|HW\|_{\mathrm{pk\_gn}}$, this means that

$$\|HW\|_{\mathrm{pk\_gn}} \le \|H\|_{\mathrm{wc}}.$$

We now establish the right-hand inequality in (5.34). Consider $w_2$ such that

$$\|w_2\|_\infty \le 1, \qquad \|\dot{w}_2\|_\infty \le 1, \qquad \|Hw_2\|_\infty = \|H\|_{\mathrm{wc}}$$

(such a $w_2$ can be shown to exist). Define $w_3$ by

$$w_3 = 2\dot{w}_2 + w_2 = W^{-1} w_2.$$

Then $\|w_3\|_\infty \le 2\|\dot{w}_2\|_\infty + \|w_2\|_\infty \le 3$, and $\|HWw_3\|_\infty = \|Hw_2\|_\infty = \|H\|_{\mathrm{wc}}$.
Hence

$$\|HW\|_{\mathrm{pk\_gn}} \ge \frac{\|HWw_3\|_\infty}{\|w_3\|_\infty} \ge \frac{\|H\|_{\mathrm{wc}}}{3},$$

which establishes the right-hand inequality in (5.34).

This example illustrates an interesting tradeoff in the selection of a norm. While
$\|H\|_{\mathrm{wc}}$ may better characterize the "size" of $H$ in a given situation, the approximation
$\|HW\|_{\mathrm{pk\_gn}}$ (or even $\|HW\|_\infty$; see the previous section) may be easier to work
with, e.g., compute. If $\|w\|_\infty \le 1$ and $\|\dot{w}\|_\infty \le 1$ is only an approximate model of
possible $w$'s, and the specification $\|z\|_\infty \le \alpha$ need only hold within a factor of two
or so, then, since $\|H\|_{\mathrm{wc}} \approx \sqrt{3}\, \|HW\|_{\mathrm{pk\_gn}}$,

$$\|HW\|_{\mathrm{pk\_gn}} \le \alpha / \sqrt{3}$$

would be an appropriate specification.

5.6 State-Space Methods for Computing Norms 

The $H_2$, Hankel, and $H_\infty$ norms, and the $\gamma$-entropy of a transfer matrix are readily
computed from a state-space realization; methods for computing some of the other
norms we have seen are described in the Notes and References. In this section we
assume that

$$\dot{x} = Ax + Bw, \qquad z = Cx$$

is a minimal realization of the stable transfer matrix $H$, i.e.,

$$H(s) = C(sI - A)^{-1} B.$$

The references cited at the end of this chapter give the generalization of the $H_\infty$
norm computation method to the case with a feed-through term ($z = Cx + Dw$).

5.6.1 Computing the H2 Norm 

Substituting the impulse matrix $h(t) = Ce^{At}B$ of $H$ into

$$\|H\|_2^2 = \mathrm{Tr} \int_0^\infty h(t)^T h(t) \, dt$$

we have

$$\|H\|_2^2 = \mathrm{Tr} \left( B^T \int_0^\infty e^{A^T t} C^T C e^{At} \, dt \; B \right) = \mathrm{Tr} \left( B^T W_{\mathrm{obs}} B \right), \tag{5.35}$$

where

$$W_{\mathrm{obs}} = \int_0^\infty e^{A^T t} C^T C e^{At} \, dt$$

is the observability Gramian of the realization, which can be computed by solving
the Lyapunov equation

$$A^T W_{\mathrm{obs}} + W_{\mathrm{obs}} A + C^T C = 0 \tag{5.36}$$

(see the Notes and References).

The observability Gramian determines the total energy in the system output,
starting from a given initial state, with no input:

$$x(0)^T W_{\mathrm{obs}}\, x(0) = \int_0^\infty z(t)^T z(t) \, dt,$$

where $\dot{x} = Ax$, $z = Cx$.

Since $\mathrm{Tr}(RS) = \mathrm{Tr}(SR)$, the above derivation can be repeated to give an alternate
formula

$$\|H\|_2 = \left( \mathrm{Tr} \left( C W_{\mathrm{contr}} C^T \right) \right)^{1/2}, \tag{5.37}$$

where

$$W_{\mathrm{contr}} = \int_0^\infty e^{At} B B^T e^{A^T t} \, dt$$

is the controllability Gramian, which can be found by solving the Lyapunov equation

$$A W_{\mathrm{contr}} + W_{\mathrm{contr}} A^T + B B^T = 0. \tag{5.38}$$

The controllability Gramian determines which points in state-space can be reached
using an input with total energy one:

$$\dot{x} = Ax + Bw, \quad x(0) = 0, \quad x(T) = x_d, \quad \int_0^T w(t)^T w(t) \, dt \le 1 \ \text{ for some } T \text{ and } w \iff x_d^T W_{\mathrm{contr}}^{-1} x_d < 1.$$

Thus, the points in state-space that can be reached using an excitation with total
energy one are given by an ellipsoid determined by $W_{\mathrm{contr}}$. (See chapter 14 for more
discussion of ellipsoids.)

5.6.2 Computing the Hankel Norm 

The Hankel norm is readily computed from the controllability and observability
Gramians via

$$\|H\|_{\mathrm{hankel}} = \left( \lambda_{\max} \left( W_{\mathrm{contr}}^{1/2} W_{\mathrm{obs}} W_{\mathrm{contr}}^{1/2} \right) \right)^{1/2} = \left( \lambda_{\max} \left( W_{\mathrm{obs}} W_{\mathrm{contr}} \right) \right)^{1/2}, \tag{5.39}$$

where $\lambda_{\max}(\cdot)$ denotes the largest eigenvalue. Roughly speaking, the Gramian $W_{\mathrm{obs}}$
measures the energy that can be "retrieved" in the output from the system state,
and $W_{\mathrm{contr}}$ measures the amount of energy that can be "stored" in the system state
using an excitation with a given energy. These are the two "parts" of the mapping
from the excitation to the resulting ring that we mentioned in section 5.2.8; (5.39)
shows how the Hankel norm depends on the "sizes" of these two parts.
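
The Gramian formulas of sections 5.6.1 and 5.6.2 can be sketched in a few lines. The example below is only an illustration under stated assumptions: it uses a hypothetical second-order system $H(s) = 1/(s^2 + s + 1)$ in companion form, solves the Lyapunov equations by plain vectorization and Gaussian elimination rather than the specialized methods cited in the Notes and References, and all helper names are made up for this sketch.

```python
import math

def solve_linear(m, rhs):
    """Solve m x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(m)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (aug[i][n] - sum(aug[i][j] * x[j] for j in range(i + 1, n))) / aug[i][i]
    return x

def lyapunov(a, q):
    """Solve a^T W + W a + q = 0 for W by vectorizing the n*n unknowns."""
    n = len(a)
    m = [[0.0] * (n * n) for _ in range(n * n)]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                m[i * n + j][k * n + j] += a[k][i]   # term (a^T W)_ij
                m[i * n + j][i * n + k] += a[k][j]   # term (W a)_ij
    w = solve_linear(m, [-q[i][j] for i in range(n) for j in range(n)])
    return [w[i * n:(i + 1) * n] for i in range(n)]

def transpose(x):
    return [list(col) for col in zip(*x)]

def matmul(x, y):
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*y)] for row in x]

def trace(x):
    return sum(x[i][i] for i in range(len(x)))

# Hypothetical example: H(s) = 1/(s^2 + s + 1) in companion form.
A = [[0.0, 1.0], [-1.0, -1.0]]
B = [[0.0], [1.0]]
C = [[1.0, 0.0]]

W_obs = lyapunov(A, matmul(transpose(C), C))                # (5.36)
W_contr = lyapunov(transpose(A), matmul(B, transpose(B)))   # (5.38)

h2_from_obs = math.sqrt(trace(matmul(matmul(transpose(B), W_obs), B)))      # (5.35)
h2_from_contr = math.sqrt(trace(matmul(matmul(C, W_contr), transpose(C))))  # (5.37)

# Largest eigenvalue of the 2x2 product W_obs W_contr from its trace and determinant.
P = matmul(W_obs, W_contr)
tr, det = trace(P), P[0][0] * P[1][1] - P[0][1] * P[1][0]
hankel = math.sqrt((tr + math.sqrt(max(tr * tr - 4.0 * det, 0.0))) / 2.0)   # (5.39)
```

For this system both $H_2$ formulas give $\sqrt{0.5}$, and the Hankel norm is about 0.81, which is below the $H_\infty$ norm of about 1.15, in agreement with the inequalities of section 5.5.1.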

5.6.3 Computing the H^ Norm 

There is a simple method for determining whether the inequality specification
$\|H\|_\infty < \gamma$ is satisfied. Given $\gamma > 0$ we define the matrix

$$M_\gamma = \begin{bmatrix} A & \gamma^{-1} B B^T \\ -\gamma^{-1} C^T C & -A^T \end{bmatrix}. \tag{5.40}$$

Then we have

$$\|H\|_\infty < \gamma \iff M_\gamma \text{ has no imaginary eigenvalues.} \tag{5.41}$$

Hence we can check whether the specification $\|H\|_\infty < \gamma$ is satisfied by forming
$M_\gamma$ and computing its eigenvalues. The equivalence (5.41) can be used to devise
an algorithm that computes $\|H\|_\infty$ with guaranteed accuracy, by trying different
values of $\gamma$; see the Notes and References at the end of this chapter.
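
One simple way to try different values of $\gamma$ is bisection. The sketch below is a plain bisection, not necessarily the algorithm of the cited references, and it is restricted to a hypothetical first-order system $\dot{x} = ax + bw$, $z = cx$ so that the eigenvalues of the $2 \times 2$ matrix $M_\gamma$ are available in closed form; for higher-order systems a numerical eigenvalue routine would be needed.

```python
import math

def has_imaginary_eigenvalue(a, b, c, gamma):
    """Test (5.41) for the first-order system dx/dt = a x + b w, z = c x.

    M_gamma = [[a, b*b/gamma], [-c*c/gamma, -a]] has zero trace, so its
    eigenvalues are +/- sqrt(-det); they lie on the imaginary axis iff det >= 0."""
    det = -a * a + (b * c / gamma) ** 2
    return det >= 0.0

def hinf_norm_bisection(a, b, c, tol=1e-9):
    """Bracket ||H||_inf, then shrink the bracket by bisection on gamma."""
    lo, hi = 0.0, 1.0
    while has_imaginary_eigenvalue(a, b, c, hi):   # grow until ||H||_inf < hi
        lo, hi = hi, 2.0 * hi
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if has_imaginary_eigenvalue(a, b, c, mid):
            lo = mid          # ||H||_inf >= mid
        else:
            hi = mid          # ||H||_inf < mid
    return 0.5 * (lo + hi)

estimate = hinf_norm_bisection(a=-1.0, b=2.0, c=1.0)
exact = abs(2.0 * 1.0 / -1.0)   # |c b / a| for H(s) = c b / (s - a)
```

Each eigenvalue test halves the interval known to contain $\|H\|_\infty$, which is what gives the guaranteed accuracy mentioned above.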

The result (5.41) can be understood as follows. $\|H\|_\infty < \gamma$ is true if and only if
for all $\omega \in \mathbf{R}$, $\gamma^2 I - H(j\omega)^* H(j\omega)$ is invertible, or equivalently, the transfer matrix

$$G(s) = \left( I - \gamma^{-2} H(-s)^T H(s) \right)^{-1}$$

has no $j\omega$ axis poles. We can derive a realization of $G$ as follows. A realization of
$H(-s)^T$ (which is the adjoint system) is given by

$$H(-s)^T = B^T \left( sI - (-A^T) \right)^{-1} (-C^T).$$

Using this and the block diagram of $G$ shown in figure 5.22, a realization of $G$ is
given by $G(s) = C_G (sI - A_G)^{-1} B_G + D_G$, where

$$A_G = M_\gamma, \qquad B_G = \begin{bmatrix} B \\ 0 \end{bmatrix}, \qquad C_G = \begin{bmatrix} 0 & \gamma^{-1} B^T \end{bmatrix}, \qquad D_G = I.$$

Since $A_G = M_\gamma$, and it can be shown that this realization of $G$ is minimal, the $j\omega$
axis poles of $G$ are exactly the imaginary eigenvalues of $M_\gamma$, and (5.41) follows.

Figure 5.22 A realization of $G(s) = \gamma^2 (\gamma^2 I - H(-s)^T H(s))^{-1}$. $G$ has no
imaginary axis poles if and only if the inequality specification $\|H\|_\infty < \gamma$
holds.

The condition that $M_\gamma$ not have any imaginary eigenvalues can also be expressed
in terms of a related algebraic Riccati equation (ARE),

$$A^T X + X A + \gamma^{-1} X B B^T X + \gamma^{-1} C^T C = 0. \tag{5.42}$$

(Conversely, $M_\gamma$ is called the Hamiltonian matrix associated with the ARE (5.42).)
This equation will have a positive definite solution $X$ if and only if $M_\gamma$ has no
imaginary eigenvalues (in which case there will be only one such $X$). If $\|H\|_\infty < \gamma$,
we can compute the positive definite solution to (5.42) as follows. We compute any
matrix $T$ such that

$$T^{-1} M_\gamma T = \begin{bmatrix} A_{11} & A_{12} \\ 0 & A_{22} \end{bmatrix},$$

where $A_{11}$ is stable (i.e., all of its eigenvalues have negative real part). (One good
choice is to compute the ordered Schur form of $M_\gamma$; see the Notes and References
at the end of this chapter.) We then partition $T$ as

$$T = \begin{bmatrix} T_{11} & T_{12} \\ T_{21} & T_{22} \end{bmatrix},$$

and the solution $X$ is given by

$$X = T_{21} T_{11}^{-1}.$$

The significance of $X$ is discussed in the Notes and References for chapter 10.

5.6.4 Computing the a-Shifted $H_\infty$ Norm

The results of the previous section can be applied to the realization

$$\dot{x} = (A + aI)x + Bw, \qquad z = Cx$$

of the $a$-shifted transfer matrix $H(s - a)$. We find that $\|H\|_{\infty,a} < \gamma$ holds if and
only if the eigenvalues of $A$ all have real part less than $-a$ and the matrix

$$\begin{bmatrix} A + aI & \gamma^{-1} B B^T \\ -\gamma^{-1} C^T C & -A^T - aI \end{bmatrix}$$

has no imaginary eigenvalues.

5.6.5 Computing the Entropy 

To compute the $\gamma$-entropy of $H$, we first form the matrix $M_\gamma$ in (5.40). If $M_\gamma$
has any imaginary eigenvalues, then by the result (5.41), $\|H\|_\infty \ge \gamma$ and hence
$I_\gamma(H) = \infty$. If $M_\gamma$ has no imaginary eigenvalues, then we find the positive definite
solution $X$ of the ARE (5.42) as described above. The $\gamma$-entropy is then

$$I_\gamma(H) = \gamma \, \mathrm{Tr} \left( B^T X B \right). \tag{5.43}$$

From (5.42), the matrix $\tilde{X} = \gamma X$ satisfies the ARE

$$A^T \tilde{X} + \tilde{X} A + \gamma^{-2} \tilde{X} B B^T \tilde{X} + C^T C = 0.$$

Note that as $\gamma \to \infty$, this ARE becomes the Lyapunov equation for the observability
Gramian (5.36), so the solution $\tilde{X}$ converges to the observability Gramian $W_{\mathrm{obs}}$.
From (5.43) and (5.35) we see again that $I_\gamma(H) \to \|H\|_2^2$ as $\gamma \to \infty$ (see (5.24) in
section 5.3.5).
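
For a first-order system the whole procedure can be carried out by hand, which gives a convenient check of (5.43) against the defining integral of section 5.3.5. The sketch below assumes a hypothetical scalar system $H(s) = cb/(s - a)$ with $a < 0$ and takes $X$ from the eigenvector of $M_\gamma$ belonging to its stable eigenvalue, as in the construction with $A_{11}$ stable; note that in this scalar case the quadratic (5.42) has a second positive root, which that construction does not select. The function names and numerical values are illustrative.

```python
import math

def entropy_from_are(a, b, c, gamma):
    """gamma-entropy of H(s) = c b/(s - a), a < 0, via (5.42) and (5.43).

    Returns math.inf when M_gamma has an imaginary eigenvalue."""
    disc = a * a - (b * c / gamma) ** 2   # square of the eigenvalues of M_gamma
    if disc <= 0.0:
        return math.inf
    lam = -math.sqrt(disc)                # the stable eigenvalue of M_gamma
    # Eigenvector (t1, t2) for lam satisfies (a - lam) t1 + (b^2/gamma) t2 = 0.
    x = (lam - a) * gamma / (b * b)       # X = t2 / t1
    return gamma * b * x * b              # gamma * Tr(B^T X B)

def entropy_from_definition(a, b, c, gamma, n=20_000):
    """Midpoint-rule value of the defining integral, substituting w = tan(theta)."""
    dtheta = math.pi / n
    total = 0.0
    for k in range(n):
        theta = -math.pi / 2.0 + (k + 0.5) * dtheta
        w = math.tan(theta)
        mag2 = (b * c) ** 2 / (w * w + a * a)              # |H(jw)|^2
        total += math.log1p(-mag2 / gamma ** 2) * (1.0 + w * w)
    return -gamma ** 2 / (2.0 * math.pi) * total * dtheta

a, b, c, gamma = -1.0, 1.0, 1.0, 2.0
i_are = entropy_from_are(a, b, c, gamma)
i_def = entropy_from_definition(a, b, c, gamma)
h2_squared = (b * c) ** 2 / (2.0 * abs(a))   # Tr(B^T W_obs B) with W_obs = c^2/(2|a|)
```

With these values $\|H\|_\infty = 1 = \gamma/2$, the two entropy computations agree, and the ratio $\sqrt{I_\gamma(H)} / \|H\|_2$ is about 1.035, inside the bounds (5.25) and (5.26).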
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Notes and References 

Norms of Signals and Systems for Control System Analysis 

The use of norms in feedback system analysis was popularized in the 1960's by researchers 
such as Zames [Zam66b], Sandberg [San64], Narendra [NG64], and Willems [Wil69], 
although some norms had been used in control systems analysis before these papers. For 
example, the square of the H2 norm is referred to as I y in the 1957 book [NGK57]; 
chapter 7 of the 1947 book [JNP47], written by Philips, is entitled RMS-Error Criterion 
in Servomechanism Design. 

A thorough reference on norms for signals and systems in the context of control systems 
is Desoer and Vidyasagar [DV75]. This book contains general and precise definitions of 
many of the norms in this and the previous chapter. Mathematics texts covering many of 
the norms we have seen include Kolmogorov and Fomin [KF75] and Aubin [Aub79]. The
bold H in the symbols $H_2$ and $H_\infty$ stands for the mathematician G. H. Hardy.

The observation that the total variation of the step response is the peak gain of a transfer 
function appears in Lunze [Lun89]. 

Singular Value Plots 

Singular value plots are discussed in, for example, Callier and Desoer [CD82a], Ma- 
ciejowski [Mac89], and Lunze [Lun89]. Analytical properties of singular values and nu- 
merical algorithms for computing them are covered in Golub and Van Loan [GL89]. 

The Entropy Interpretation 

The simple interpretation of the entropy, as the average square of the H2 norm when a 
random feedback is connected around a transfer function, has not appeared before. We do 
not know how to generalize this interpretation to the case of a transfer matrix, although 
it is likely that there is a similar interpretation. 

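To prove the result (5.27) we consider an arbitrary $h \in \mathbf{C}$, and $\Delta$ a complex random
variable uniformly distributed on the disk of radius $1/\gamma$. We then have

$$\mathbf{E} \left| \frac{h}{1 - \Delta h} \right|^2 = \frac{\gamma^2}{\pi} \int_0^{1/\gamma} \int_0^{2\pi} \left| \frac{h}{1 - r e^{i\theta} h} \right|^2 r \, d\theta \, dr = \begin{cases} -\gamma^2 \log\left(1 - |h|^2/\gamma^2\right) & |h| < \gamma \\ \infty & |h| \ge \gamma \end{cases}$$

(the integration over $\theta$ can be evaluated by residues). By integrating over $\omega$, the
result (5.27) follows.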

Comparing Gains 

The result (5.33) is from Boyd and Doyle [BD87]. 

Small Gain Theorem 

The small gain theorem from section 5.4.2 is a standard mathematical result. Applications
of this result (and extensions) to feedback system analysis are discussed in Desoer and
Vidyasagar [DV75] and Vidyasagar [Vid78]; see also chapter 10.

State-Space Norm Computations

Using the controllability or observability Gramian to compute H2 norms is standard; 
see for example [Fra87]. The Lyapunov equations that arise are nowadays solved nu- 
merically by special methods; see Bartels and Stewart [BS72] and Golub, Nash, and 
Van Loan [GNL79]. Tables of formulas for the H2 norm of a transfer function, in terms 
of its numerator and denominator coefficients, can be found in Appendix E2 of Newton, 
Gould, and Kaiser [NGK57]. These tables are based on a method that is equivalent to 
solving the Lyapunov equations. (Professor T. Higgins points out that there are several 
errors in these tables.) 

The result on Hankel norm computation can be found in, e.g., [Glo84, §2.3] and [Fra87]. 
The result of section 5.6.3 is from Boyd, Balakrishnan, and Kabamba [BBK89]; see also 
Robel [Rob89] and Boyd and Balakrishnan [BB90]. The method for computing the en- 
tropy appears in Mustafa and Glover [MG90] and Glover and Mustafa [GM89]. 

A discussion of solving the ARE can be found in [AM90]. The method of solving the
ARE based on the Schur form is discussed in Laub [Lau79]; see also the articles [AL84,
Doo81]. Numerical issues of these and other state-space computations are discussed in
Laub [Lau85].

Computing Some Other Norms 

Computing the peak gain or peak-step norm from a state-space description of an LTI
system is more involved than computing the entropy or the $H_2$ or $H_\infty$ norm. Perhaps
the simplest method is to numerically integrate (i.e., solve) the state-space equations
to obtain the impulse or step response matrix. $\|H\|_{\mathrm{pk\_gn}}$ could then be computed by
numerical integration of the integrals in the formula (5.23). Similarly, $\|H\|_{\mathrm{pk\_step}}$ could be
determined directly from its definition and the computed step response matrix.

For other norms, e.g., $\|H\|_{\mathrm{wc}}$, there is not even a simple formula like (5.23). Nevertheless
it can be computed in several ways; we briefly mention some here. It can be expressed as

$$\|H\|_{\mathrm{wc}} = \sup \left\{ \int_0^\infty h(t) w(t) \, dt \;\middle|\; \|w\|_\infty \le M_{\mathrm{ampl}}, \ \|\dot{w}\|_\infty \le M_{\mathrm{slew}} \right\}, \tag{5.44}$$

which is an infinite-dimensional convex optimization problem that can be solved using
the methods described in chapters 13-15. Alternatively, the (infinite-dimensional) dual
problem can be solved:



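$$\|H\|_{\mathrm{wc}} = \min_{\mu \in \mathbf{R}, \ \lambda : \mathbf{R}_+ \to \mathbf{R}} \; M_{\mathrm{ampl}} \|\lambda\|_1 + M_{\mathrm{slew}} \left\| \mu + \int_0^t \left( h(\tau) - \lambda(\tau) \right) d\tau \right\|_1.$$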

This dual problem is unconstrained. 

The computation of $\|H\|_{\mathrm{wc}}$ can also be formulated as an optimal control problem. We
include $w$ as an additional state, so the dynamics are

$$\dot{x} = Ax + Bw, \qquad \dot{w} = u, \qquad z = Cx, \qquad x(0) = w(0) = 0.$$

The peak and slew-rate limits on $w$ can be enforced as the control and state constraints

$$|u(t)| \le M_{\mathrm{slew}}, \qquad |w(t)| \le M_{\mathrm{ampl}}.$$

The objective is then simply $z(T)$ with $T$ free. The solution of this free end-point optimal
control problem (which always occurs as $T \to \infty$) yields $\|H\|_{\mathrm{wc}}$. This optimal control
problem has linear dynamics and convex state and control constraints, so essentially all
numerical methods of solution will find the value $\|H\|_{\mathrm{wc}}$ (and not some local minimum).
See, for example, the books by Pontryagin [Pon62], Bryson and Ho [BH75, ch. 7], and
the survey article by Polak [Pol73].

The Figures of Section 5.2.9 

The input signals in figure 5.11 were computed by finely discretizing (with first-order hold)
the optimization problem

$$\max_{\|w\|_\infty \le 1, \ \|\dot{w}\|_\infty \le 1} \; \int_0^{10} h(10 - t) w(t) \, dt$$

and solving the resulting linear program.

The input signals in figure 5.12 were computed from (5.6) with $T = 10$.

The unit-energy input signals in figure 5.13 give the largest possible square root output
energy for $t > 5$. These input signals were computed by finding the finite-time controllability
Gramian

$$W = \int_0^5 e^{At} B B^T e^{A^T t} \, dt = W_{\mathrm{contr}} - e^{5A} W_{\mathrm{contr}} e^{5A^T},$$

where $H(s) = C(sI - A)^{-1} B$.

If $\lambda$ is the largest eigenvalue of $W^{1/2} W_{\mathrm{obs}} W^{1/2}$ and $z$ is the corresponding eigenvector,
with $\|z\|_2 = 1$, then the input signal

$$w(t) = \begin{cases} B^T e^{A^T (5 - t)} W^{-1/2} z & \text{for } 0 \le t \le 5 \\ 0 & \text{otherwise,} \end{cases}$$

has unit energy, and drives the system state to $W^{1/2} z$ at $t = 5$. (It is actually the smallest
energy signal that drives the state from the origin to $W^{1/2} z$ in 5 seconds.) The output for
$t > 5$,

$$z(t) = C e^{A(t - 5)} W^{1/2} z,$$

has square root energy

$$\lambda^{1/2} = \left( \lambda_{\max} \left( W^{1/2} W_{\mathrm{obs}} W^{1/2} \right) \right)^{1/2}$$

(c.f. (5.39)).



Chapter 6 

Geometry of Design 
Specifications 



In this chapter we explore some geometric properties that design specifications 
may have, and define the important notion of a closed-loop convex design spec- 
ification. We will see in the sequel that simple and effective methods can be 
used to solve controller design problems that are formulated entirely in terms of 
closed-loop convex design specifications. 



6.1 Design Specifications as Sets 

$\mathcal{H}$ will denote the set of all $n_z \times n_w$ closed-loop transfer matrices; we may think
of $\mathcal{H}$ as the set of all conceivable candidate transfer matrices for the given plant.
Recall from chapter 3 that design specifications are boolean functions or predicates
on $\mathcal{H}$. With each design specification $\mathcal{D}_i$ we will associate the set $\mathcal{H}_i$ of all transfer
matrices that satisfy it:

$$\mathcal{H}_i = \{ H \in \mathcal{H} \mid H \text{ satisfies } \mathcal{D}_i \}.$$

Of course, there is a one-to-one correspondence between subsets of $\mathcal{H}$ (i.e., sets
of transfer matrices) and design specifications. For this reason we will also refer to
subsets of $\mathcal{H}$ as design specifications. Whether the predicate (e.g., $\phi_{\mathrm{os}}(H) \le 6\%$)
or subset ($\{H \mid \phi_{\mathrm{os}}(H) \le 6\%\}$) is meant should be clear from the context, if it
matters at all.

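The boolean algebra of design specifications mentioned in chapter 3 corresponds
exactly to the boolean algebra of subsets, with some of the correspondences listed
in table 6.1.

Design specifications                                        Sets of transfer matrices
$H$ satisfies $\mathcal{D}_1$                                $H \in \mathcal{H}_1$
$\mathcal{D}_1$ is stronger than $\mathcal{D}_2$             $\mathcal{H}_1 \subseteq \mathcal{H}_2$
$\mathcal{D}_1$ is weaker than $\mathcal{D}_2$               $\mathcal{H}_1 \supseteq \mathcal{H}_2$
$\mathcal{D}_1 \wedge \mathcal{D}_2$                         $\mathcal{H}_1 \cap \mathcal{H}_2$
$\mathcal{D}_1$ is infeasible                                $\mathcal{H}_1 = \emptyset$
$\mathcal{D}_1$ is feasible                                  $\mathcal{H}_1 \neq \emptyset$
$\mathcal{D}_1$ is strictly stronger than $\mathcal{D}_2$    $\mathcal{H}_1 \subseteq \mathcal{H}_2$, $\mathcal{H}_1 \neq \mathcal{H}_2$

Table 6.1 Properties of design specifications and the corresponding sets of
transfer matrices.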



6.2 Affine and Convex Sets and Functionals 

In this section we introduce several important definitions. 

We remind the reader that $\mathcal{H}$ is a vector space: roughly speaking, we have a
way of adding two of its elements (i.e., $n_z \times n_w$ transfer matrices) and multiplying
one by a real scalar. In a vector space, we have the important concepts of a line
and a line segment.

If $H, \tilde{H} \in \mathcal{H}$ and $\lambda \in \mathbf{R}$, we will refer to $\lambda H + (1-\lambda)\tilde{H}$ as an affine combination
of $H$ and $\tilde{H}$. We may think of an affine combination as lying on the line passing
through $H$ and $\tilde{H}$, provided $H \neq \tilde{H}$. If $0 \le \lambda \le 1$, we will refer to the affine
combination $\lambda H + (1-\lambda)\tilde{H}$ as a convex combination of $H$ and $\tilde{H}$. We may think of
a convex combination as lying on the line segment between $H$ and $\tilde{H}$. The number
$\lambda$ measures the fraction of the line segment that we move from $\tilde{H}$ towards $H$ to
yield $\lambda H + (1-\lambda)\tilde{H}$. This can be seen in figure 6.1.

Definition 6.1: $\mathcal{H}_1 \subseteq \mathcal{H}$ is affine if for any $H, \tilde{H} \in \mathcal{H}_1$, and any $\lambda \in \mathbf{R}$,
$\lambda H + (1-\lambda)\tilde{H} \in \mathcal{H}_1$.

Thus a set of transfer matrices is affine if, whenever two distinct transfer matrices
are in the set, so is the entire line passing through them.

Definition 6.2: $\mathcal{H}_1 \subseteq \mathcal{H}$ is convex if for any $H, \tilde{H} \in \mathcal{H}_1$, and any $\lambda \in [0,1]$,
$\lambda H + (1-\lambda)\tilde{H} \in \mathcal{H}_1$.

Thus a set of transfer matrices is convex if, whenever two transfer matrices are
in the set, so is the entire line segment between them.
These notions are extended to functionals as follows:

Definition 6.3: A functional $\phi$ on $\mathcal{H}$ is affine if for any $H, \tilde{H} \in \mathcal{H}$, and any
$\lambda \in \mathbf{R}$, $\phi(\lambda H + (1-\lambda)\tilde{H}) = \lambda\phi(H) + (1-\lambda)\phi(\tilde{H})$.

Figure 6.1 The line passing through $H$ and $\tilde{H}$ consists of all affine combinations
of $H$ and $\tilde{H}$, i.e., $\lambda H + (1-\lambda)\tilde{H}$, $\lambda \in \mathbf{R}$. The line segment between
$H$ and $\tilde{H}$ consists of all convex combinations of $H$ and $\tilde{H}$, i.e., $\lambda H + (1-\lambda)\tilde{H}$
for $0 \le \lambda \le 1$.

A functional is affine if the graph of its values along any line in $\mathcal{H}$ is a line in
$\mathbf{R}^2$; an example is shown in figure 6.2.

Definition 6.4: A functional $\phi$ on $\mathcal{H}$ is convex if for any $H, \tilde{H} \in \mathcal{H}$, and any
$\lambda \in [0,1]$, $\phi(\lambda H + (1-\lambda)\tilde{H}) \le \lambda\phi(H) + (1-\lambda)\phi(\tilde{H})$.

A functional is convex if the graph of its values along any line segment in $\mathcal{H}$ lies
on or below the line segment joining its values at the ends of the line segment. This is
shown in figure 6.3.

Under very mild conditions we can test convexity of a set or functional by just
checking the case $\lambda = 1/2$. Specifically, a set $\mathcal{H}_1$ is convex if and only if whenever
$H \in \mathcal{H}_1$ and $\tilde{H} \in \mathcal{H}_1$, the average $(H + \tilde{H})/2$ is also in $\mathcal{H}_1$. Similarly, a functional
$\phi$ is convex if and only if, for every $H$ and $\tilde{H}$ we have

$$ \phi\big((H + \tilde{H})/2\big) \le \big(\phi(H) + \phi(\tilde{H})\big)/2. $$

Since $(H + \tilde{H})/2$ can be interpreted as the midpoint of the line segment between $H$
and $\tilde{H}$, this simple test is called the midpoint rule.
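As a rough numerical illustration of definition 6.2 and the midpoint rule, the sketch below searches sampled points for a pair whose midpoint leaves the set. It is only a sketch under stated assumptions: real 2-vectors stand in for transfer matrices, and the sets `in_ball` and `in_annulus` and the helper `midpoint_counterexample` are hypothetical names chosen for this example. A failed search does not prove convexity; a found pair does disprove it.

```python
import numpy as np

def in_ball(x, radius=1.0):
    """Convex set: Euclidean ball of the given radius."""
    return np.linalg.norm(x) <= radius

def in_annulus(x, inner=0.5, outer=1.0):
    """Nonconvex set: points whose norm lies between inner and outer."""
    return inner <= np.linalg.norm(x) <= outer

def midpoint_counterexample(member, points):
    """Return a pair of members whose midpoint is not a member, or None."""
    inside = [p for p in points if member(p)]
    for i, p in enumerate(inside):
        for q in inside[i + 1:]:
            if not member((p + q) / 2):
                return p, q
    return None

rng = np.random.default_rng(0)
samples = rng.uniform(-1.0, 1.0, size=(200, 2))  # stand-ins for transfer matrices

ball_result = midpoint_counterexample(in_ball, samples)        # no violating pair expected
annulus_result = midpoint_counterexample(in_annulus, samples)  # a violating pair expected
```

For the annulus, two members on opposite sides of the hole have a midpoint near the origin, which is what the search finds.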



Figure 6.2 A functional $\phi$ is affine if for every pair of transfer matrices $H$
and $\tilde{H}$ the graph of $\phi(\lambda H + (1-\lambda)\tilde{H})$ versus $\lambda$ is a straight line passing
through the points $(0, \phi(\tilde{H}))$, $(1, \phi(H))$.

Figure 6.3 A functional $\phi$ is convex if for every pair of transfer matrices
$H$ and $\tilde{H}$ the graph of $\phi$ along the line $\lambda H + (1-\lambda)\tilde{H}$ lies on or below a
straight line through the points $(0, \phi(\tilde{H}))$, $(1, \phi(H))$, i.e., in the shaded
region.



6.2.1 Some Important Properties

We collect here some useful facts about affine and convex sets and functionals.

• Affine implies convex. If a set or functional is affine, then it is convex: being
affine is a stronger condition than being convex. If the functionals $\phi$ and $-\phi$ are
both convex, then $\phi$ is affine.

• Intersections. Intersections of affine or convex sets are affine or convex,
respectively.

• Weighted-sum Functional. If the functionals $\phi_1, \ldots, \phi_L$ are convex, and
$\lambda_1 \ge 0, \ldots, \lambda_L \ge 0$, then the weighted-sum functional

$$ \phi_{\rm wt\_sum}(H) = \lambda_1\phi_1(H) + \cdots + \lambda_L\phi_L(H) $$

is convex (see section 3.6.1).

• Weighted-max Functional. If the functionals $\phi_1, \ldots, \phi_L$ are convex, and
$\lambda_1 \ge 0, \ldots, \lambda_L \ge 0$, then the weighted-max functional

$$ \phi_{\rm wt\_max}(H) = \max\{\lambda_1\phi_1(H), \ldots, \lambda_L\phi_L(H)\} $$

is convex (see section 3.6.3).

The last two properties can be generalized to the integral of a family of convex
functionals and the maximum of an infinite family of convex functionals. Suppose
that for each $\alpha \in I$ ($I$ is an arbitrary index set), the functional $\phi_\alpha$ is convex. Then
the functional

$$ \phi(H) = \sup\{\phi_\alpha(H) \mid \alpha \in I\} $$

is convex.
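The weighted-sum and weighted-max properties can be spot-checked numerically. The following is a minimal sketch, assuming real 3-vectors as stand-ins for transfer matrices and two hypothetical convex functionals `phi_1` and `phi_2`; it evaluates the largest violation of the convexity inequality over random pairs and a grid of $\lambda$ values.

```python
import numpy as np

def phi_1(x):
    return np.linalg.norm(x, 2)        # a norm, hence convex

def phi_2(x):
    return np.max(np.abs(x - 1.0))     # max-norm distance to a target, convex

WEIGHTS = (2.0, 0.5)                   # nonnegative weights (illustrative values)

def weighted_sum(x):
    return WEIGHTS[0] * phi_1(x) + WEIGHTS[1] * phi_2(x)

def weighted_max(x):
    return max(WEIGHTS[0] * phi_1(x), WEIGHTS[1] * phi_2(x))

def convexity_violation(phi, x, y, lams):
    """Largest phi(lam x + (1-lam) y) - [lam phi(x) + (1-lam) phi(y)] over lams."""
    return max(phi(l * x + (1 - l) * y) - (l * phi(x) + (1 - l) * phi(y))
               for l in lams)

rng = np.random.default_rng(1)
lams = np.linspace(0.0, 1.0, 21)
worst_sum, worst_max = -np.inf, -np.inf
for _ in range(100):
    x, y = rng.normal(size=3), rng.normal(size=3)
    worst_sum = max(worst_sum, convexity_violation(weighted_sum, x, y, lams))
    worst_max = max(worst_max, convexity_violation(weighted_max, x, y, lams))
```

Both `worst_sum` and `worst_max` should be nonpositive up to rounding error, in line with the two bullet items above.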

We now describe some of the relations between sets and functionals that are
convex or affine. A functional equality specification formed from an affine functional
defines an affine set: if $\phi$ is affine and $\alpha \in \mathbf{R}$, then

$$ \{H \mid \phi(H) = \alpha\} $$

is affine. Similarly, if $\phi$ is convex and $\alpha \in \mathbf{R}$, then the functional inequality
specification

$$ \{H \mid \phi(H) \le \alpha\}, $$

called a sub-level set of $\phi$, is convex.

The converse is not true, however: there are functionals that are not convex,
but every sub-level set is convex. Such functionals are our next topic.

6.2.2 Quasiconvex Functionals

Definition 6.5: A functional $\phi$ on $\mathcal{H}$ is quasiconvex if for each $\alpha \in \mathbf{R}$, the
functional inequality specification $\{H \mid \phi(H) \le \alpha\}$ is convex.

An equivalent definition of quasiconvexity, which has a form similar to the
definition of convexity, is: whenever $H, \tilde{H} \in \mathcal{H}$ and $\lambda \in [0,1]$,

$$ \phi(\lambda H + (1-\lambda)\tilde{H}) \le \max\{\phi(H), \phi(\tilde{H})\}. $$

From definition 6.4 we can see that every convex functional is quasiconvex, since
$\lambda\phi(H) + (1-\lambda)\phi(\tilde{H}) \le \max\{\phi(H), \phi(\tilde{H})\}$ for $\lambda \in [0,1]$.

The values of a quasiconvex functional along a line in $\mathcal{H}$ are plotted in figure 6.4;
note that this functional is not convex. A quasiconvex function of one variable is
called unimodal, since, roughly speaking, it cannot have two separate regions where
it is small.



Figure 6.4 A functional $\phi$ is quasiconvex if for every pair of transfer
matrices $H$ and $\tilde{H}$ the graph of $\phi$ along the line $\lambda H + (1-\lambda)\tilde{H}$ lies on or below
the larger of $\phi(H)$ and $\phi(\tilde{H})$, i.e., in the shaded region.



A positive weighted maximum of quasiconvex functionals is quasiconvex, but a 
positive-weighted sum of quasiconvex functionals need not be quasiconvex. 
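A one-variable example makes these two statements concrete. The sketch below is an illustration under stated assumptions: the functions `f1` and `f2` are hypothetical examples on the real line (not functionals from the text), and the check is a grid test of the inequality $\phi(\lambda x + (1-\lambda)y) \le \max\{\phi(x), \phi(y)\}$, so a `True` result is evidence rather than proof.

```python
import numpy as np

def f1(x):
    return np.sqrt(abs(x + 1.0))   # quasiconvex (unimodal), but not convex

def f2(x):
    return np.sqrt(abs(x - 1.0))   # quasiconvex (unimodal), but not convex

def f_max(x):
    return max(f1(x), f2(x))       # maximum of quasiconvex functions

def f_sum(x):
    return f1(x) + f2(x)           # sum of quasiconvex functions

def is_quasiconvex_on_grid(phi, grid, lams, tol=1e-9):
    """Grid test of phi(lam x + (1-lam) y) <= max(phi(x), phi(y))."""
    for x in grid:
        for y in grid:
            bound = max(phi(x), phi(y))
            if any(phi(l * x + (1 - l) * y) > bound + tol for l in lams):
                return False
    return True

grid = np.linspace(-3.0, 3.0, 25)   # contains -1, 0 and 1 exactly
lams = np.linspace(0.0, 1.0, 11)    # contains 0.5 exactly

f1_quasi = is_quasiconvex_on_grid(f1, grid, lams)
max_quasi = is_quasiconvex_on_grid(f_max, grid, lams)
sum_quasi = is_quasiconvex_on_grid(f_sum, grid, lams)
f1_midpoint_gap = f1(0.0) - 0.5 * (f1(-1.0) + f1(1.0))  # positive: f1 is not convex
```

The sum fails because `f_sum(0) = 2` exceeds `f_sum(-1) = f_sum(1) = sqrt(2)`: the sum has two separate regions where it is small.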

There is a natural correspondence between quasiconvex functionals and nested
families of convex specifications, i.e., linearly ordered parametrized sets of
specifications. Given a quasiconvex functional $\phi$, we have the family of functional inequality
specifications given by $\mathcal{H}_\alpha = \{H \mid \phi(H) \le \alpha\}$. This family is linearly ordered: $\mathcal{H}_\alpha$
is stronger than $\mathcal{H}_\beta$ if $\alpha \le \beta$.






Conversely, suppose we are given a family of convex specifications $\mathcal{H}_\alpha$, indexed
by a parameter $\alpha \in \mathbf{R}$, such that $\mathcal{H}_\alpha$ is stronger than $\mathcal{H}_\beta$ if $\alpha \le \beta$ (so the family of
specifications is linearly ordered). The functional

$$ \phi_{\rm family}(H) = \inf\{\alpha \mid H \in \mathcal{H}_\alpha\} \tag{6.1} $$

($\phi_{\rm family}(H) = \infty$ if $H \notin \mathcal{H}_\alpha$ for all $\alpha$) simply assigns to a transfer matrix $H$ the index
corresponding to the tightest specification that it satisfies, as shown in figure 6.5.
This functional $\phi_{\rm family}$ is easily shown to be quasiconvex, and its sub-level sets are
essentially the original specifications:

$$ \mathcal{H}_\alpha \subseteq \{H \mid \phi_{\rm family}(H) \le \alpha\} \subseteq \mathcal{H}_{\alpha+\epsilon} $$

for any positive $\epsilon$.

Figure 6.5 Five members of a nested family of convex sets are shown.
Such nested families define a quasiconvex function by (6.1); for the given
transfer matrix $H$, we have $\phi_{\rm family}(H) = 2$.
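When the nested family is available only through a membership test, the infimum in (6.1) can be approximated by bisection on the index. The sketch below assumes a hypothetical family of nested balls $\{x : \|x\| \le \alpha\}$ over real vectors (a stand-in for transfer matrices), for which the functional defined by (6.1) is simply the norm; the names `in_spec` and `phi_family` are illustrative.

```python
import numpy as np

def in_spec(x, alpha):
    """Membership test for the nested family {x : ||x|| <= alpha} (an assumption)."""
    return np.linalg.norm(x) <= alpha

def phi_family(x, member, lo=0.0, hi=1.0, tol=1e-9, hi_max=1e12):
    """Approximate inf{alpha : member(x, alpha)} by bisection.

    Assumes the family is nested (membership is monotone in alpha) and that
    no index below `lo` contains x.
    """
    while not member(x, hi):          # grow the upper bracket until x is inside
        hi *= 2.0
        if hi > hi_max:
            return np.inf             # x satisfies no specification in the family
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if member(x, mid):
            hi = mid                  # the specification with index mid is satisfied
        else:
            lo = mid                  # the specification with index mid is too tight
    return hi

value = phi_family(np.array([3.0, 4.0]), in_spec)   # close to ||(3, 4)|| = 5
```

Bisection relies only on the nesting of the family, not on convexity of its members.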



6.2.3 Linear Transformations 

Convex subsets of $\mathcal{H}$ and convex functionals on $\mathcal{H}$ are often defined via linear
transformations. Suppose that $V$ is a vector space and $L : \mathcal{H} \to V$ is a linear function.
If $\mathcal{V}$ is a convex (or affine) subset of $V$, then the subset of $\mathcal{H}$ defined by

$$ \mathcal{H}_1 = \{H \mid L(H) \in \mathcal{V}\} $$

is convex (or affine). Similarly if $\psi$ is a convex (or quasiconvex or affine) functional
on $V$, then the functional $\phi$ on $\mathcal{H}$ defined by

$$ \phi(H) = \psi(L(H)) $$

is convex (or quasiconvex or affine, respectively).

These facts are easily established from the definitions above; we mention a few
important examples.

Selecting a Submatrix or Entry of H

A simple but important example is the linear transformation that "selects" a
submatrix or entry from $H$. More precisely, $V$ is the vector space of $p \times q$ transfer
matrices and $L$ is given by

$$ L(H) = E_z^T H E_w $$

where $E_z \in \mathbf{R}^{n_z \times p}$ and $E_w \in \mathbf{R}^{n_w \times q}$; the columns of $E_z$ and $E_w$ are unit vectors.
Thus $L(H)$ is a submatrix of $H$ (or an entry of $H$ if $p = q = 1$); the unit vectors in $E_z$
and $E_w$ select the subsets of regulated variables and exogenous inputs, respectively.
If $\psi$ is a functional on $p \times q$ transfer matrices, a functional $\phi$ on $\mathcal{H}$ is given by

$$ \phi(H) = \psi(L(H)) = \psi(E_z^T H E_w). \tag{6.2} $$

Informally, $\phi$ results from applying $\psi$ to a certain submatrix of $H$; a convex (or
quasiconvex or affine) functional of an entry or submatrix of $H$ yields a convex (or
quasiconvex or affine) functional of $H$.

To avoid cumbersome notation, we will often describe functionals or specifications
that take as argument only an entry or submatrix of $H$, relying on the reader
to extend the functional to $\mathcal{H}$ via (6.2).
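The selection map $L(H) = E_z^T H E_w$ is easy to reproduce numerically. In the sketch below, a constant $2 \times 3$ matrix stands in for a transfer matrix evaluated at a single frequency (an assumption made to keep the example finite-dimensional), and `selector` is a hypothetical helper that builds a matrix of unit-vector columns.

```python
import numpy as np

def selector(n, indices):
    """n x len(indices) matrix whose columns are the unit vectors e_i, i in indices."""
    return np.eye(n)[:, list(indices)]

# Stand-in for H at one frequency: n_z = 2 regulated variables, n_w = 3 inputs.
H = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])
G = np.array([[0.5, -1.0, 2.0],
              [1.5, 0.0, -3.0]])

E_z = selector(2, [1])       # keep the second regulated variable
E_w = selector(3, [0, 1])    # keep the first two exogenous inputs

def L(M):
    return E_z.T @ M @ E_w   # the submatrix [M_21  M_22]

sub = L(H)
lam = 0.3
is_linear = np.allclose(L(lam * H + (1 - lam) * G), lam * L(H) + (1 - lam) * L(G))
```

Here `sub` is the row `[4, 5]`, i.e., the entries $H_{21}$ and $H_{22}$, and the last line confirms that the map commutes with affine combinations.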

Time Domain Responses 

Let $V$ consist of scalar signals on $\mathbf{R}_+$, and let $L$ be the transformation that maps
a transfer matrix into the unit step response of its $i, k$ entry:

$$ L(H) = s $$

where, for $t \ge 0$,

$$ s(t) = \frac{1}{2\pi} \int_{-\infty}^{\infty} \frac{H_{ik}(j\omega)}{j\omega} e^{j\omega t} \, d\omega. $$

Since $L$ is linear, we see that a convex constraint on the step response of the $i, k$
entry of a transfer matrix is a convex specification on $H$.

A similar situation occurs when $L$ maps $H$ into its response to any particular
input signal $w_{\rm part}$:

$$ L(H) = H w_{\rm part} $$

where $V$ is the set of all $n_z$-component vector signals. If $Z_{\rm spec}$ is a convex subset of
$V$, then the specification

$$ \{H \mid H w_{\rm part} \in Z_{\rm spec}\} $$

is a convex subset of $\mathcal{H}$.



6.3 Closed-Loop Convex Design Specifications

Many design specifications have the property that the set of closed-loop transfer
matrices that satisfy the design specification is convex. We call such design
specifications closed-loop convex:

Definition 6.6: A design specification $\mathcal{D}$ is closed-loop convex if the set of
closed-loop transfer matrices that satisfy $\mathcal{D}$ is convex.

One of the themes of this book is that many design specifications are closed-loop
convex.

6.3.1 Open Versus Closed-Loop Formulation 

We noted in section 3.1 that it is possible to formulate design specifications in terms 
of the (open-loop) controller transfer matrix K, instead of the closed-loop transfer 
matrix H, as we have done. In such a formulation, a design specification is a 
predicate on candidate controllers, and the feasibility problem is to find a controller 
K that satisfies a set of design specifications. Such a formulation may seem more 
natural than ours, since the specifications refer directly to what we design — the 
controller K. 

There is no logical difference between these two formulations, since in sensible 
problems there is a one-to-one correspondence between controllers and the closed- 
loop transfer matrices that they achieve. The difference appears when we consider 
geometrical concepts such as convexity: a closed-loop convex specification will gen- 
erally not correspond to a convex set of controllers. In chapters 13-16, we will see 
that convexity of the specifications is the key to computationally tractable solution 
methods for the controller design problem. The same design problems, if expressed 
in terms of the controller K, have specifications that are not convex, and hence do 
not have this great computational advantage. 

6.3.2 Norm-Bound Specifications 

Many useful functionals are norms of an entry or submatrix of the transfer matrix
$H$. A general form for a design specification is the norm-bound specification

$$ \|H_{xx} - H_{xx}^{\rm des}\| \le \alpha, \tag{6.3} $$

where $\|\cdot\|$ is some norm on transfer functions or transfer matrices (see chapter 5)
and $H_{xx}$ is a submatrix or entry of $H$ (see section 6.2.3). We can interpret $H_{xx}^{\rm des}$ as
the desired transfer matrix, and the norm $\|\cdot\|$ used in (6.3) as our measure of the
deviation of $H_{xx}$ from the desired transfer matrix $H_{xx}^{\rm des}$. We often have $H_{xx}^{\rm des} = 0$,
in which case (6.3) limits the size of $H_{xx}$.
We now show that the functional

$$ \phi(H) = \|H_{xx} - H_{xx}^{\rm des}\| $$

is convex. Let $H$ and $\tilde{H}$ be any two transfer matrices, and let $0 \le \lambda \le 1$. Now,
using the triangle inequality and homogeneity property for norms, together with
the fact that $\lambda$ and $1 - \lambda$ are nonnegative, we see that

$$
\begin{aligned}
\phi(\lambda H + (1-\lambda)\tilde{H}) &= \|\lambda H_{xx} + (1-\lambda)\tilde{H}_{xx} - H_{xx}^{\rm des}\| \\
&= \|\lambda(H_{xx} - H_{xx}^{\rm des}) + (1-\lambda)(\tilde{H}_{xx} - H_{xx}^{\rm des})\| \\
&\le \|\lambda(H_{xx} - H_{xx}^{\rm des})\| + \|(1-\lambda)(\tilde{H}_{xx} - H_{xx}^{\rm des})\| \\
&= \lambda\|H_{xx} - H_{xx}^{\rm des}\| + (1-\lambda)\|\tilde{H}_{xx} - H_{xx}^{\rm des}\| \\
&= \lambda\phi(H) + (1-\lambda)\phi(\tilde{H}),
\end{aligned}
$$

so the functional $\phi$ is convex. Since $\phi$ is convex, the norm-bound specification (6.3)
is convex. In chapters 8-10 we will see that many specifications can be expressed
in the form (6.3), and are therefore closed-loop convex. (We note that the entropy
functional defined in section 5.3.5 is also convex, although it is not a norm.)
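The convexity argument for the norm-bound functional can be checked on sampled frequency responses. The sketch below is an illustration only: it assumes scalar transfer functions evaluated on a finite frequency grid, approximates a peak-gain norm by the largest magnitude on that grid, and uses hypothetical transfer functions `H_a`, `H_b` and `H_des` that do not come from the text.

```python
import numpy as np

omega = np.logspace(-2, 2, 400)      # frequency grid in rad/s (an assumption)
s = 1j * omega

H_des = 1.0 / (0.5 * s + 1.0)        # hypothetical desired transfer function
H_a = 1.0 / (s + 1.0)                # two hypothetical candidate designs
H_b = 2.0 / (s**2 + s + 2.0)

def phi(H):
    """Grid approximation of the peak magnitude of H - H_des."""
    return np.max(np.abs(H - H_des))

lams = np.linspace(0.0, 1.0, 11)
# Convexity: lam*phi(H_a) + (1-lam)*phi(H_b) - phi(lam*H_a + (1-lam)*H_b) >= 0.
gaps = [lam * phi(H_a) + (1 - lam) * phi(H_b) - phi(lam * H_a + (1 - lam) * H_b)
        for lam in lams]

alpha = max(phi(H_a), phi(H_b))      # a bound that both designs satisfy
mix_ok = all(phi(lam * H_a + (1 - lam) * H_b) <= alpha + 1e-12 for lam in lams)
```

Every entry of `gaps` should be nonnegative up to rounding error, and `mix_ok` confirms that all sampled convex combinations satisfy any norm bound satisfied by both designs.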

An important variation on the norm-bound specification (6.3) is the specification

$$ \|H_{xx}\| < \infty, \tag{6.4} $$

which requires that an entry or submatrix of $H$ have a finite norm, as measured by
$\|\cdot\|$. This specification is affine, since if $H_{xx}$ and $\tilde{H}_{xx}$ each satisfy (6.4) and $\lambda \in \mathbf{R}$,
then we have

$$ \|\lambda H_{xx} + (1-\lambda)\tilde{H}_{xx}\| \le |\lambda| \, \|H_{xx}\| + |1-\lambda| \, \|\tilde{H}_{xx}\| < \infty, $$

so that $\lambda H_{xx} + (1-\lambda)\tilde{H}_{xx}$ also satisfies (6.4).

If we use the $\mathbf{H}_\infty$ norm for $\|\cdot\|$, (6.4) is the specification that $H_{xx}$ be stable,
i.e., the poles of $H_{xx}$ have negative real parts. Similarly, if we use the $a$-shifted $\mathbf{H}_\infty$
norm as $\|\cdot\|$, (6.4) is the specification that the poles of $H_{xx}$ have real parts less
than $-a$. These specifications are therefore affine.



6.4 Some Examples 

In chapters 7-10 we will encounter many specifications and functionals that are 
affine or convex; in most cases we will simply state that they are affine or convex 
without a detailed justification. In this section we consider a few typical specifica- 
tions and functionals for our standard example plant (see section 2.4), and carefully 
establish that they are affine or convex. 

6.4.1 An Affine Specification 

Consider the specification that the closed-loop transfer function from the reference
input $r$ to $y_p$ have unity gain at $\omega = 0$, so that a constant reference input results in
$y_p = r$ in steady-state. The set of $2 \times 3$ transfer matrices that corresponds to this
specification is

$$ \mathcal{H}_{\rm dc} = \{H \mid H_{13}(0) = 1\}. \tag{6.5} $$

We will show that this set is affine. Suppose that $H, \tilde{H} \in \mathcal{H}_{\rm dc}$, so that $H_{13}(0) =
\tilde{H}_{13}(0) = 1$, and $\lambda \in \mathbf{R}$. Then the transfer matrix $H_\lambda = \lambda H + (1-\lambda)\tilde{H}$ satisfies

$$ H_{\lambda,13}(0) = \lambda H_{13}(0) + (1-\lambda)\tilde{H}_{13}(0) = \lambda + (1-\lambda) = 1, $$

and hence $H_\lambda \in \mathcal{H}_{\rm dc}$.

Alternatively, we note that the specification $\mathcal{H}_{\rm dc}$ can be written as the functional
equality constraint

$$ \mathcal{H}_{\rm dc} = \{H \mid \phi_{\rm dc}(H) = 1\}, $$

where $\phi_{\rm dc}(H) = H_{13}(0)$ is an affine functional.

6.4.2 Convex Specifications 

Consider the specification $\mathcal{D}_{\rm act\_eff}$ introduced in section 3.1: "the RMS deviation of
$u$ due to the sensor and process noises is less than 0.1". The corresponding set of
transfer matrices is

$$ \mathcal{H}_{\rm act\_eff} = \{H \mid \phi_{\rm act\_eff}(H) \le 0.1\}, $$

where we defined the functional

$$ \phi_{\rm act\_eff}(H) = \left( \frac{1}{2\pi} \int_{-\infty}^{\infty} \left( |H_{21}(j\omega)|^2 S_{\rm proc}(\omega) + |H_{22}(j\omega)|^2 S_{\rm sensor}(\omega) \right) d\omega \right)^{1/2}, $$

and $S_{\rm proc}$ and $S_{\rm sensor}$ are the power spectral densities of the noises $n_{\rm proc}$ and $n_{\rm sensor}$,
respectively.

To show that $\phi_{\rm act\_eff}$ is a convex functional we first express it as a norm of the
submatrix $[H_{21} \; H_{22}]$ of $H$:

$$ \phi_{\rm act\_eff}(H) = \left\| \begin{bmatrix} 0 & 1 \end{bmatrix} H \begin{bmatrix} W_{\rm proc} & 0 \\ 0 & W_{\rm sensor} \\ 0 & 0 \end{bmatrix} \right\|_2 $$

where $|W_{\rm proc}(j\omega)|^2 = S_{\rm proc}(\omega)$ and $|W_{\rm sensor}(j\omega)|^2 = S_{\rm sensor}(\omega)$. From the results
of sections 6.2.3 and 6.3.2 we conclude that $\phi_{\rm act\_eff}$ is a convex functional, and
therefore $\mathcal{H}_{\rm act\_eff}$ is a convex specification.
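A numerical version of this argument evaluates the RMS functional directly and as a weighted 2-norm, and then checks the convexity inequality for one pair of designs. Everything specific in the sketch is an assumption made for illustration: the truncated frequency axis, the noise spectra `S_proc` and `S_sensor`, the real nonnegative spectral factors, and the two hypothetical pairs of entries $(H_{21}, H_{22})$.

```python
import numpy as np

def trapezoid(y, x):
    """Composite trapezoid rule, written out to avoid NumPy version differences."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

omega = np.linspace(-200.0, 200.0, 40001)      # truncated frequency axis (assumption)
s = 1j * omega
S_proc = 0.04 * np.ones_like(omega)            # hypothetical noise spectra
S_sensor = 0.01 / (1.0 + (omega / 50.0) ** 2)

def phi_act_eff(H21, H22):
    """RMS functional evaluated from its integral definition."""
    integrand = np.abs(H21) ** 2 * S_proc + np.abs(H22) ** 2 * S_sensor
    return np.sqrt(trapezoid(integrand, omega) / (2 * np.pi))

def phi_as_norm(H21, H22):
    """Same quantity as the 2-norm of the weighted row [H21 W_proc, H22 W_sensor]."""
    W_proc, W_sensor = np.sqrt(S_proc), np.sqrt(S_sensor)   # |W|^2 = S
    row = np.vstack([H21 * W_proc, H22 * W_sensor])
    return np.sqrt(trapezoid(np.sum(np.abs(row) ** 2, axis=0), omega) / (2 * np.pi))

H = (1.0 / (s + 1.0), 2.0 / (s + 2.0))                  # (H21, H22) for one design
H_t = (3.0 / (s**2 + 2 * s + 3.0), 1.0 / (s + 5.0))     # (H21, H22) for another

lam = 0.3
mix = tuple(lam * h + (1 - lam) * g for h, g in zip(H, H_t))
gap = lam * phi_act_eff(*H) + (1 - lam) * phi_act_eff(*H_t) - phi_act_eff(*mix)
```

The two evaluations agree, and `gap` is nonnegative, which is the convexity inequality of definition 6.4 for this pair and this value of $\lambda$.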

As another example, consider the specification $\mathcal{D}_{\rm os}$ introduced in section 3.1:
"the step response overshoot from the command to $y_p$ is less than 10%". We
can see that the corresponding set of transfer matrices, $\mathcal{H}_{\rm os}$, is convex, using the
argument in section 6.2.3 with $i = 1$, $k = 3$ and the convex subset of scalar signals

$$ \mathcal{V} = \{s : \mathbf{R}_+ \to \mathbf{R} \mid s(t) \le 1.1 \text{ for } t \ge 0\}. \tag{6.6} $$

This is illustrated in figure 6.6.

Figure 6.6 The set $\mathcal{V}$ in (6.6) is convex, since if the step responses $s_{13}$ and
$\tilde{s}_{13}$ do not exceed 1.1, then their average, $(s_{13} + \tilde{s}_{13})/2$, also does not exceed
1.1. By the argument in section 6.2.3, the specification $\mathcal{H}_{\rm os}$ is convex.

6.4.3 A Quasiconvex Functional 

We will see in chapter 8 that several important functionals are quasiconvex, for 
example, those relating to settling time and bandwidth of a system. We describe 
one such example here. 

Consider the stability degree of a transfer matrix, defined by

$$ \phi_{\rm stab\_deg}(H) = \max\{\Re p \mid p \text{ is a pole of } H\}. $$

The functional $\phi_{\rm stab\_deg}$ is quasiconvex since, for each $\alpha$,

$$ \{H \mid \phi_{\rm stab\_deg}(H) \le \alpha\} = \{H \mid \|H\|_{\infty,\beta} < \infty \text{ for } \beta < -\alpha\} $$

(recall that $\|\cdot\|_{\infty,\beta}$ is the $\beta$-shifted $\mathbf{H}_\infty$ norm described in section 5.2.7), and we
saw above that the latter specification is affine. $\phi_{\rm stab\_deg}$ is not convex, however:
for most values of $\lambda$, we have

$$ \phi_{\rm stab\_deg}(\lambda H + (1-\lambda)\tilde{H}) = \max\{\phi_{\rm stab\_deg}(H), \phi_{\rm stab\_deg}(\tilde{H})\}. $$
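A small scalar example shows both the quasiconvexity and the failure of convexity. The sketch below assumes two hypothetical transfer functions, $1/(s+1)$ and $1/(s+2)$, stored as (numerator, denominator) coefficient arrays; the helpers `combine` and `stab_degree` are illustrative, and the pole test assumes simple poles so that a cancelled pole can be detected by a vanishing numerator.

```python
import numpy as np

def combine(lam, f, g):
    """Coefficients of lam*f + (1-lam)*g over the common denominator."""
    (nf, df), (ng, dg) = f, g
    num = np.polyadd(lam * np.polymul(nf, dg), (1 - lam) * np.polymul(ng, df))
    return num, np.polymul(df, dg)

def stab_degree(f, tol=1e-9):
    """Largest real part among poles that are not cancelled by the numerator."""
    num, den = f
    poles = [p for p in np.roots(den) if abs(np.polyval(num, p)) > tol]
    return max(p.real for p in poles)

H = (np.array([1.0]), np.array([1.0, 1.0]))     # 1/(s+1), stability degree -1
H_t = (np.array([1.0]), np.array([1.0, 2.0]))   # 1/(s+2), stability degree -2

lam = 0.5
mixed = stab_degree(combine(lam, H, H_t))        # both poles survive: degree -1
convex_bound = lam * stab_degree(H) + (1 - lam) * stab_degree(H_t)   # -1.5
quasi_bound = max(stab_degree(H), stab_degree(H_t))                  # -1
```

The combination keeps the pole at $-1$, so its stability degree meets the quasiconvex bound with equality but exceeds the bound that convexity would require.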



6.5 Implications for Tradeoffs and Optimization 

When the design specifications are convex, and more generally, when a family of
design specifications is given by convex functional inequalities, we can say much
more about the concepts introduced in chapter 3.

6.5.1 Performance Space Geometry

Suppose that in the multicriterion optimization problem described in section 3.5,
the hard constraint $\mathcal{D}_{\rm hard}$ and the objective functionals $\phi_1, \ldots, \phi_L$ are convex. Then
the region of achievable specifications in performance space,

$$ \mathcal{A} = \left\{ a \in \mathbf{R}^L \;\middle|\; \mathcal{D}_{\rm hard} \wedge \mathcal{D}^{a_1}_{\phi_1} \wedge \cdots \wedge \mathcal{D}^{a_L}_{\phi_L} \text{ is achievable} \right\}, \tag{6.7} $$

is also convex.

To see this, suppose that $a$ and $\tilde{a}$ each correspond to achievable specifications.
Then there are transfer matrices $H$ and $\tilde{H}$ that satisfy $\mathcal{D}_{\rm hard}$ and also, for each $k$,
$1 \le k \le L$,

$$ \phi_k(H) \le a_k, \qquad \phi_k(\tilde{H}) \le \tilde{a}_k. $$

Now suppose that $0 \le \lambda \le 1$. The transfer matrix $\lambda H + (1-\lambda)\tilde{H}$ satisfies the hard
constraint $\mathcal{D}_{\rm hard}$, since $\mathcal{D}_{\rm hard}$ is convex, and also, for each $k$, $1 \le k \le L$,

$$ \phi_k(\lambda H + (1-\lambda)\tilde{H}) \le \lambda a_k + (1-\lambda)\tilde{a}_k, $$

since the functional $\phi_k$ is convex. But this means that $\lambda a + (1-\lambda)\tilde{a}$
corresponds to an achievable specification, which verifies that $\mathcal{A}$ is convex.

An important consequence of the convexity of the achievable region in
performance space is that every Pareto optimal specification is optimal for the classical
optimization problem with weighted-sum objective, for some nonnegative weights
(cf. figure 3.8). Thus the specifications considered in multicriterion optimization or
classical optimization with the weighted-sum or weighted-max objectives are exactly
the same as the Pareto optimal specifications.

In the next section we discuss another important consequence of the convexity 
of A (which in fact is equivalent to the observation above). 

6.6 Convexity and Duality 

The dual function $\psi$ defined in section 3.6.2 is always concave, meaning that $-\psi$ is
convex, even if the objective functionals and the hard constraint are not convex. To
see this, we note that for each transfer matrix $H$ that satisfies $\mathcal{D}_{\rm hard}$, the function
$\psi_H$ of $\lambda$ defined by

$$ \psi_H(\lambda) = -\lambda_1\phi_1(H) - \cdots - \lambda_L\phi_L(H) $$

is convex (indeed, linear). Since $-\psi$ can be expressed as the maximum of this family
of convex functions, i.e.,

$$ -\psi(\lambda) = \max\{\psi_H(\lambda) \mid H \text{ satisfies } \mathcal{D}_{\rm hard}\}, \tag{6.8} $$

it is also a convex function of $\lambda$ (see section 6.2.1).

The dual function can be used to determine specifications that are not
achievable, and therefore represent a limit of performance (recall the discussion in
section 1.4.1). Whenever $\lambda \ge 0$, meaning $\lambda \in \mathbf{R}^L_+$, and $a \in \mathbf{R}^L$, we have

$$ \psi(\lambda) > a^T\lambda \;\Longrightarrow\; \mathcal{D}_{\rm hard} \wedge \mathcal{D}^{a_1}_{\phi_1} \wedge \cdots \wedge \mathcal{D}^{a_L}_{\phi_L} \text{ is unachievable.} \tag{6.9} $$

To establish (6.9), suppose that $a$ corresponds to an achievable specification, i.e.,
$\mathcal{D}_{\rm hard} \wedge \mathcal{D}^{a_1}_{\phi_1} \wedge \cdots \wedge \mathcal{D}^{a_L}_{\phi_L}$ is achievable. Then there is some transfer matrix $\tilde{H}$ that
satisfies $\mathcal{D}_{\rm hard}$ and $\phi_i(\tilde{H}) \le a_i$ for each $i$. Therefore whenever $\lambda \ge 0$,

$$
\begin{aligned}
\psi(\lambda) &= \min\{\lambda_1\phi_1(H) + \cdots + \lambda_L\phi_L(H) \mid H \text{ satisfies } \mathcal{D}_{\rm hard}\} \\
&\le \lambda_1\phi_1(\tilde{H}) + \cdots + \lambda_L\phi_L(\tilde{H}) \\
&\le \lambda_1 a_1 + \cdots + \lambda_L a_L \\
&= a^T\lambda.
\end{aligned}
$$

The implication (6.9) follows.

Figure 6.7 The shaded region corresponds to specifications that satisfy
$\psi(\lambda) > a^T\lambda$, and hence by (6.9) are unachievable. The specification $\mathcal{D}_x$ is
unachievable, but cannot be proven unachievable by (6.9), for any choice of
$\lambda$. See also figure 3.8.

The specifications that (6.9) establishes are unachievable have the simple
geometric interpretation shown in figure 6.7, which suggests that when the region of
achievable specifications is convex, (6.9) rules out all unachievable specifications as
$\lambda$ varies over all nonnegative weight vectors. This intuition is correct, except for a
technical condition. More precisely, when $\mathcal{D}_{\rm hard}$ and each objective functional $\phi_i$ is
convex, and the region of achievable specifications in performance space is closed,
then we have:

$$ \text{the specification corresponding to } a \text{ is achievable} \;\Longleftrightarrow\; \text{there is no } \lambda \ge 0 \text{ with } \psi(\lambda) > a^T\lambda. \tag{6.10} $$

The equivalence (6.10) is called the convex duality principle, or a theorem of
alternatives, since it states that exactly one of two alternative assertions must hold.
The geometrical interpretation of the duality principle can be seen from figure 6.8:
(6.10) asserts that every specification is either achievable (i.e., lies in the shaded
region to the upper right), or can be shown unachievable using (6.9) (i.e., lies in
the shaded region to the lower left) for some choice of $\lambda$.



Figure 6.8 Unlike figure 6.7, in this figure the region of achievable
specifications is convex, so, by varying $\lambda$, (6.9) can be made to rule out every
specification that is unachievable. (Regions marked in the figure: "achievable"
and "$\psi(\lambda) > a^T\lambda$, unachievable".)

The convex duality principle (6.10) is often expressed in terms of pairs of
constrained optimization problems, e.g., minimizing the functional $\phi_1$ subject to the
hard constraint and functional inequality specifications for the remaining
functionals $\phi_2, \ldots, \phi_L$. We define

$$ \alpha_{\rm pri} = \min\left\{ \phi_1(H) \;\middle|\; H \text{ satisfies } \mathcal{D}_{\rm hard} \wedge \mathcal{D}^{a_2}_{\phi_2} \wedge \cdots \wedge \mathcal{D}^{a_L}_{\phi_L} \right\}, \tag{6.11} $$

$$ \alpha_{\rm dual} = \max\left\{ \psi(\lambda) - a_2\lambda_2 - \cdots - a_L\lambda_L \;\middle|\; \lambda \ge 0, \; \lambda_1 = 1 \right\}. \tag{6.12} $$

The optimization problem on the right-hand side of (6.11) is called the primal
problem; the right-hand side of (6.12) is called a corresponding dual problem. The
convex duality principle can then be stated as follows: if the hard constraint and
the objective functionals are all convex, then we have $\alpha_{\rm pri} = \alpha_{\rm dual}$. (Here we do
not need the technical condition, provided we interpret the min and max as an
infimum and supremum, respectively, and use the convention that the minimum of
an infeasible problem is $\infty$.)

When there are no convexity assumptions on the hard constraint
or the objectives, we still have $\alpha_{\rm pri} \ge \alpha_{\rm dual}$, but strict inequality can hold, in
which case the difference is called the duality gap for the problem.
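A duality gap is easy to exhibit when the hard constraint is a finite, and hence nonconvex, set. The sketch below is a toy illustration, not a controller design problem: it assumes three hypothetical designs described only by their values of $(\phi_1, \phi_2)$, takes $L = 2$ and $a_2 = 1$, and approximates the maximization in (6.12) by a grid search over $\lambda_2$.

```python
import numpy as np

# Rows are hypothetical designs; columns are their (phi_1, phi_2) values.
designs = np.array([[0.0, 2.0],
                    [2.0, 0.0],
                    [1.8, 0.9]])
a2 = 1.0

# Primal problem (6.11): minimize phi_1 over designs with phi_2 <= a2.
feasible = designs[designs[:, 1] <= a2]
alpha_pri = feasible[:, 0].min()

def psi(lam):
    """Dual function: minimum of the weighted-sum objective over all designs."""
    return np.min(designs @ lam)

# Dual problem (6.12) with lambda_1 = 1, approximated on a grid of lambda_2 >= 0.
lam2_grid = np.linspace(0.0, 5.0, 501)
alpha_dual = max(psi(np.array([1.0, l2])) - a2 * l2 for l2 in lam2_grid)

duality_gap = alpha_pri - alpha_dual
```

Here the primal value is 1.8 while the dual value is about 1.0. The point $(1, 1)$, midway between the first two designs, would attain the dual value if the design set were convex, which is consistent with the statement that convexity closes the gap.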
Notes and References

See the Notes and References for chapters 13 and 14 for general books covering the notion 
of convexity. 

Closed-Loop Convex Specifications 

The idea of a closed-loop formulation of controller design specifications has a long history; 
see section 16.3. In the early work, however, convexity is not mentioned explicitly. 

The explicit observation that many design specifications are closed-loop convex can be 
found in Salcudean's thesis [Sal86], Boyd et al. [BBB88], Polak and Salcudean [PS89], 
and Boyd, Barratt, and Norman [BBN90]. 

Optimization 

Many of the Notes and References from chapter 3 consider in detail the special case 
of convex specifications and functionals, and so are relevant. See also the Notes and 
References from chapters 13 and 14. 

Duality 

The fact that the nonnegatively weighted-sum objectives yield all Pareto optimal specifica- 
tions is shown in detail in Da Cunha and Polak [CP67], and in chapter 6 of Clarke [Cla83]. 
The results on convex duality are standard, and can be found in complete detail in, e.g., 
Barbu and Precupanu [BP78] or Luc [Luc89]. 

A Stochastic Interpretation of Closed-Loop Convex Functionals 

Jensen's inequality states that if $\phi$ is a convex functional, $H \in \mathcal{H}$, and $H_{\rm stoch}$ is any
zero-mean $\mathcal{H}$-valued random variable, then

$$ \phi(H) \le \mathbf{E}\,\phi(H + H_{\rm stoch}), \tag{6.13} $$

i.e., zero-mean random fluctuations in a transfer matrix increase, on average, the value of
a convex functional. In fact, Jensen's inequality characterizes convex functionals: if (6.13)
holds for all (deterministic) $H$ and all zero-mean $H_{\rm stoch}$, then $\phi$ is convex.

Chapter 7

Realizability and Closed-Loop 
Stability 



In this chapter we consider the design specifications of realizability and internal 
(closed-loop) stability. The central result is that the set of closed-loop transfer 
matrices realizable with controllers that stabilize the plant is affine and readily 
described. This description is referred to as the parametrization of closed-loop 
transfer matrices achieved by stabilizing controllers. 



7.1 Realizability 



An important constraint on the transfer matrix $H \in \mathcal{H}$ is that it should be the
closed-loop transfer matrix achieved by some controller $K$, in other words, $H$ should have
the form $P_{zw} + P_{zu}K(I - P_{yu}K)^{-1}P_{yw}$ for some $K$. We will refer to this constraint
as realizability:

$$ \mathcal{H}_{\rm rlzbl} = \left\{ H \;\middle|\; H = P_{zw} + P_{zu}K(I - P_{yu}K)^{-1}P_{yw} \text{ for some } K \right\}. \tag{7.1} $$

We can think of $\mathcal{H}_{\rm rlzbl}$ as expressing the dependencies among the various closed-loop
transfer functions that are entries of $H$.

As an example, consider the classical SASS 1-DOF control system described in
section 2.3.2. The closed-loop transfer matrix is given by

$$
H = \begin{bmatrix}
\dfrac{P_0}{1+P_0K} & -\dfrac{P_0K}{1+P_0K} & \dfrac{P_0K}{1+P_0K} \\[2ex]
-\dfrac{P_0K}{1+P_0K} & -\dfrac{K}{1+P_0K} & \dfrac{K}{1+P_0K}
\end{bmatrix}
= \begin{bmatrix}
P_0(1-T) & -T & T \\
-T & -T/P_0 & T/P_0
\end{bmatrix}
$$

where $T$ is the classical closed-loop I/O transfer function. Hence if $H$ is the
closed-loop transfer matrix realized by some controller $K$, i.e., $H \in \mathcal{H}_{\rm rlzbl}$, then we must
have

$$ H_{12} = H_{21}, \tag{7.2} $$

$$ P_0 H_{22} = H_{21}, \tag{7.3} $$

$$ H_{11} - P_0 H_{21} = P_0, \tag{7.4} $$

$$ H_{13} = -H_{12}, \tag{7.5} $$

$$ H_{23} = -H_{22}. \tag{7.6} $$

It is not hard to show that the five explicit specifications on H given in (7.2-7.6) 
are not just implied by realizability; they are equivalent to it. Roughly speaking, 
we have only one transfer function that we can design, K, whereas the closed-loop 
transfer matrix H contains six transfer functions. The five constraints (7.2-7.6) 
among these six transfer functions make up the missing degrees of freedom. 

The specifications (7.2-7.6) are affine, so at least for the classical SASS 1-DOF
controller, realizability is an affine specification. In fact, we will see that in the
general case, $\mathcal{H}_{\rm rlzbl}$ is affine. Thus, realizability is a closed-loop convex specification.

To establish that $\mathcal{H}_{\rm rlzbl}$ is affine in the general case, we use a simple trick that
replaces the inverse appearing in (7.1) with a simpler expression. Given any $n_u \times n_y$
transfer matrix $K$, we define the $n_u \times n_y$ transfer matrix $R$ by

$$ R = K(I - P_{yu}K)^{-1}. \tag{7.7} $$

This correspondence is one-to-one: given any $n_u \times n_y$ transfer matrix $R$, the $n_u \times n_y$
transfer matrix $K$ given by

$$ K = (I + RP_{yu})^{-1}R \tag{7.8} $$

makes sense and satisfies (7.7).

Hence we can express the realizability specification as

$$ \mathcal{H}_{\rm rlzbl} = \{H \mid H = P_{zw} + P_{zu}RP_{yw} \text{ for some } n_u \times n_y \; R\}. \tag{7.9} $$

This form of $\mathcal{H}_{\rm rlzbl}$ can be given a simple interpretation, which is shown in figure 7.1.
The transfer matrix $R$ can be thought of as the "controller" that would realize the
closed-loop transfer matrix $H$ if there were no feedback through our plant, i.e., $P_{yu} = 0$
(see figure 7.1(b)). Our trick above is the observation that we can reconstruct
the controller $K$ that has the same effect on the true plant as the controller $R$ that
operates on the plant with $P_{yu}$ set to zero. Variations on this simple trick will
appear again later in this chapter.
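The correspondence between $K$ and $R$ in (7.7) and (7.8) can be verified numerically at a single frequency. The sketch below assumes constant complex matrices as stand-ins for the transfer matrices evaluated at one point $s = j\omega$, with hypothetical dimensions $n_u = 2$ and $n_y = 3$; the random controller gain is scaled only so that $I - P_{yu}K$ is safely invertible.

```python
import numpy as np

rng = np.random.default_rng(2)
n_u, n_y = 2, 3

# Stand-ins for P_yu and K evaluated at one frequency (complex constant matrices).
P_yu = rng.normal(size=(n_y, n_u)) + 1j * rng.normal(size=(n_y, n_u))
K0 = rng.normal(size=(n_u, n_y)) + 1j * rng.normal(size=(n_u, n_y))
K = 0.25 * K0 / (np.linalg.norm(P_yu, 2) * np.linalg.norm(K0, 2))

# (7.7): R = K (I - P_yu K)^{-1}
R = K @ np.linalg.inv(np.eye(n_y) - P_yu @ K)

# (7.8): K = (I + R P_yu)^{-1} R, computed with a linear solve
K_back = np.linalg.solve(np.eye(n_u) + R @ P_yu, R)

round_trip_ok = np.allclose(K_back, K)
```

Recovering `K` from `R` in this way is the step that allows the realizable set to be described by the affine expression (7.9) in $R$.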

Figure 7.1 The closed-loop transfer matrix $H$ can be realized by the feedback
system (a) for some $K$ if and only if it can be realized by the system
(b) for some transfer matrix $R$. In (b) there is no feedback.

From (7.9) we can establish that $\mathcal{H}_{\rm rlzbl}$ is affine. Suppose that $H, \tilde{H} \in \mathcal{H}_{\rm rlzbl}$.
Then there are two $n_u \times n_y$ transfer matrices $R$ and $\tilde{R}$ such that

$$ H = P_{zw} + P_{zu}RP_{yw}, \qquad \tilde{H} = P_{zw} + P_{zu}\tilde{R}P_{yw}. $$

Let $\lambda \in \mathbf{R}$. We must show that the transfer matrix $H_\lambda = \lambda H + (1-\lambda)\tilde{H}$ is also
realizable as the closed-loop transfer matrix of our plant with some controller. We
note that

$$ H_\lambda = P_{zw} + P_{zu}R_\lambda P_{yw}, $$

where

$$ R_\lambda = \lambda R + (1-\lambda)\tilde{R}. $$

This shows that $H_\lambda \in \mathcal{H}_{\rm rlzbl}$.

We can find the controller $K_\lambda$ that realizes the closed-loop transfer matrix $H_\lambda$
using the formula (7.8) with $R_\lambda$. If $K$ and $\tilde{K}$ are controllers that yield the
closed-loop transfer matrices $H$ and $\tilde{H}$, respectively, the controller that realizes the
closed-loop transfer matrix $H_\lambda$ is not $\lambda K + (1-\lambda)\tilde{K}$; it is

$$ K_\lambda = (A + \lambda B)^{-1}(C + \lambda D) \tag{7.10} $$

where

$$
\begin{aligned}
A &= I + \tilde{K}(I - P_{yu}\tilde{K})^{-1}P_{yu}, \\
B &= K(I - P_{yu}K)^{-1}P_{yu} - \tilde{K}(I - P_{yu}\tilde{K})^{-1}P_{yu}, \\
C &= \tilde{K}(I - P_{yu}\tilde{K})^{-1}, \\
D &= K(I - P_{yu}K)^{-1} - \tilde{K}(I - P_{yu}\tilde{K})^{-1}.
\end{aligned}
$$

The special form of $K_\lambda$ given in (7.10) is called a bilinear or linear fractional
dependence on $\lambda$. We will encounter this form again in section 7.2.6.
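The formula (7.10) can be checked against the closed-loop map directly. The sketch below again uses constant real matrices as single-frequency stand-ins for the plant blocks, with hypothetical dimensions and randomly generated data; `to_R`, `closed_loop` and `random_controller` are illustrative helper names. It confirms that $K_\lambda$ realizes $\lambda H + (1-\lambda)\tilde{H}$ while the naive combination $\lambda K + (1-\lambda)\tilde{K}$ generally does not.

```python
import numpy as np

rng = np.random.default_rng(3)
n_z, n_w, n_u, n_y = 2, 3, 2, 3
P_zw = rng.normal(size=(n_z, n_w))     # hypothetical plant blocks at one frequency
P_zu = rng.normal(size=(n_z, n_u))
P_yw = rng.normal(size=(n_y, n_w))
P_yu = rng.normal(size=(n_y, n_u))

def random_controller():
    """Random n_u x n_y gain, scaled so that I - P_yu K is safely invertible."""
    K0 = rng.normal(size=(n_u, n_y))
    return 0.25 * K0 / (np.linalg.norm(P_yu, 2) * np.linalg.norm(K0, 2))

def to_R(K):
    return K @ np.linalg.inv(np.eye(n_y) - P_yu @ K)    # (7.7)

def closed_loop(K):
    return P_zw + P_zu @ to_R(K) @ P_yw                 # (7.1) written via R

K, K_t = random_controller(), random_controller()
R, R_t = to_R(K), to_R(K_t)
lam = 0.5

A = np.eye(n_u) + R_t @ P_yu
B = (R - R_t) @ P_yu
C = R_t
D = R - R_t
K_lam = np.linalg.solve(A + lam * B, C + lam * D)       # (7.10)

H_target = lam * closed_loop(K) + (1 - lam) * closed_loop(K_t)
matches = np.allclose(closed_loop(K_lam), H_target)
naive_matches = np.allclose(closed_loop(lam * K + (1 - lam) * K_t), H_target)
```

The set of realizable closed-loop matrices is affine in $R$ even though the corresponding set of controllers is not affine in $K$, which is what `matches` and `naive_matches` illustrate.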

7.1.1 An Example 

We consider the standard plant example of section 2.4. The step responses from the
reference input $r$ to $y_p$ and $u$ for each of the controllers $K^{(a)}$ and $K^{(b)}$ of section 2.4
are shown in figures 7.2(a) and (b). Since $\mathcal{H}_{\rm rlzbl}$ is affine, we conclude that every
transfer matrix on the line passing through $H^{(a)}$ and $H^{(b)}$,

$$ \lambda H^{(a)} + (1-\lambda)H^{(b)}, \quad \lambda \in \mathbf{R}, $$

can be realized as the closed-loop transfer matrix of our standard plant with some
controller. Figures 7.2(c) and (d) show the step responses from $r$ to $y_p$ and $u$ of five
of the transfer matrices on this line.

The average of the two closed-loop transfer matrices ($\lambda = 0.5$) is realized by
the controller $K_{0.5}$, which can be found from (7.10). Even though both $K^{(a)}$ and
$K^{(b)}$ are 3rd order, $K_{0.5}$ turns out to be 9th order. From this fact we draw two
conclusions. First, the specification that $H$ be the transfer matrix achieved by a
controller $K$ of order no greater than $n$,

$$ \mathcal{H}_{{\rm rlzbl},n} = \left\{ H \;\middle|\; \begin{array}{l} H = P_{zw} + P_{zu}K(I - P_{yu}K)^{-1}P_{yw} \\ \text{for some } K \text{ with } \mathrm{order}(K) \le n \end{array} \right\}, $$

is not in general convex (whereas the specification $\mathcal{H}_{\rm rlzbl}$, which puts no limit on the
order of $K$, is convex). Our second observation is that the controller $K_{0.5}$, which
yields a closed-loop transfer matrix that is the average of the closed-loop transfer
matrices achieved by the controllers $K^{(a)}$ and $K^{(b)}$, would not have been found by
varying the parameters (e.g., numerator and denominator coefficients) in $K^{(a)}$ and
$K^{(b)}$.



7.2 Internal Stability 

We remind the reader that a transfer function is proper if it has at least as many 
poles as finite zeros, or equivalently, if it has a state-space realization; a transfer 
function is stable if it is proper and all its poles have negative real part; finally, a 
transfer matrix is stable if all of its entries are stable. We noted in section 6.4.1 
that these are affine constraints on a transfer function or transfer matrix.

Figure 7.2 (a) shows the closed-loop step responses from $r$ to $y_p$ for the
standard example with the two controllers $K^{(a)}$ and $K^{(b)}$. (b) shows the step
responses from $r$ to $u$. In (c) and (d) the step responses corresponding to
five different values of $\lambda$ are shown. Each of these step responses is achieved
by some controller.



7.2.1 A Motivating Example 



Consider our standard example SASS 1-DOF control system described in section 2.4,
with the controller

\[ K(s) = \frac{36 + 33s}{10 - s}. \]

This controller yields the closed-loop I/O transfer function

\[ T = \frac{33s + 36}{s^3 + 10s^2 + 33s + 36} = \frac{33s + 36}{(s + 3)^2 (s + 4)}, \]

which is a stable lowpass filter. Thus, we will have $y_p \approx r$ provided the reference
signal $r$ does not change too rapidly; the controller $K$ yields good tracking of slowly
varying reference signals.

If realizability and good tracking of slowly varying reference signals are our
only specifications, then $K$ is a good controller. The potential problem with this
controller can be seen by examining the whole closed-loop transfer matrix,

\[ H = \left[ \begin{array}{ccc}
\dfrac{10 - s}{(s + 3)^2 (s + 4)} & -\dfrac{33s + 36}{(s + 3)^2 (s + 4)} & \dfrac{33s + 36}{(s + 3)^2 (s + 4)} \\[2ex]
-\dfrac{33s + 36}{(s + 3)^2 (s + 4)} & -\dfrac{s^2 (33s + 36)(10 + s)}{(s + 3)^2 (s + 4)(10 - s)} & \dfrac{s^2 (33s + 36)(10 + s)}{(s + 3)^2 (s + 4)(10 - s)}
\end{array} \right]. \]



The entries $H_{22}$ and $H_{23}$, which are the closed-loop transfer functions from the
sensor noise and reference input to u, are unstable: for example, a reference input 
with a very small peak can cause the actuator signal to have a very large peak, a 
situation that is probably undesirable. 
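The hidden unstable mode can be exposed with a few lines of exact polynomial arithmetic. The sketch below is illustrative only: it assumes the plant $P_0(s) = (10 - s)/(s^2 (10 + s))$, which is inferred from the entries of $H$ above rather than stated in this section, and the helper names are hypothetical. It forms the closed-loop characteristic polynomial before any pole-zero cancellation and checks that it has a root at $s = 10$ in addition to the three stable roots that appear in $T$.

```python
from fractions import Fraction

def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists, highest power first."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += Fraction(a) * Fraction(b)
    return out

def poly_add(p, q):
    """Add two polynomials (highest power first), padding the shorter one."""
    n = max(len(p), len(q))
    p = [Fraction(0)] * (n - len(p)) + [Fraction(a) for a in p]
    q = [Fraction(0)] * (n - len(q)) + [Fraction(b) for b in q]
    return [a + b for a, b in zip(p, q)]

def poly_eval(p, s):
    """Evaluate a polynomial at s using Horner's rule."""
    acc = Fraction(0)
    for a in p:
        acc = acc * s + a
    return acc

# Assumed plant P0 = (10 - s) / (s^2 (10 + s)); controller K = (33s + 36) / (10 - s).
num_p, den_p = [-1, 10], poly_mul([1, 0, 0], [1, 10])
num_k, den_k = [33, 36], [-1, 10]

# Characteristic polynomial of 1 + P0 K = 0 before cancellation:
# den_p * den_k + num_p * num_k.
char_poly = poly_add(poly_mul(den_p, den_k), poly_mul(num_p, num_k))

# Candidate factorization (10 - s)(s + 3)^2 (s + 4).
expected = poly_mul([-1, 10], poly_mul(poly_mul([1, 3], [1, 3]), [1, 4]))

print(char_poly == expected)        # does the factorization hold?
print(poly_eval(char_poly, 10))     # value of the characteristic polynomial at s = 10
```

The factor $(10 - s)$ cancels in $T$ but not in the transfer functions to $u$, which is why only the latter reveal the instability.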

So with the controller K, the I/O transfer function T is quite benign, even 
desirable, but the closed-loop system will probably have a very large actuator signal. 
For a classical design approach, in which the requirement that the actuator signal 
not be too large is not explicitly stated (i.e., it is side information), this example 
provides a "paradox": the I/O transfer function is acceptable, but the controller is 
not a reasonable design. 

This example shows the importance of considering all of the closed-loop transfer 
functions of interest in a control system, i.e., H, and not just the I/O transfer 
function T. In our framework, there is no paradox to explain: the controller K can 
be seen to be unacceptable by examining the whole closed-loop transfer matrix H, 
and not just T. 

This phenomenon of a controller yielding a stable I/O transfer function, but with 
other closed-loop transfer functions unstable, is called internal instability. The qual- 
ifier internal stresses that the problem with the design cannot be seen by examining 
the I/O transfer function alone. 

Various arguments have been made to explain the "paradox" of this example, 
i.e., why our controller K is not an acceptable design. They include: 

1. The unstable plant zero at s = 10 is canceled by the controller pole at s = 10. 
Such unstable pole-zero cancellations between the plant and controller cannot 
be allowed, because a slight perturbation of the plant zero, e.g., to s = 9.99, 
will cause the I/O transfer function to become unstable. 

2. A state-space description of the closed-loop system will be unstable (it will 
have an eigenvalue of 10), so for most initial conditions, the state will grow 
larger and larger as time progresses. The unstable mode is unobservable from 
$y_p$, which is why it does not appear as a pole in the I/O transfer function.

These are valid arguments, but in fact, they correspond to different "new", previously
unstated, specifications on our system. In addition to the specifications of
realizability and stability of the I/O transfer function, (1) is a sensitivity specification,
and (2) requires that other signals (the components of the state vector) should
not grow too large when the initial conditions are nonzero. Since these specifica- 
tions are probably necessary in any real control system, they should be explicitly 
included in the specifications. 
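Argument (1) can be made concrete numerically. The sketch below again assumes the plant $P_0(s) = (z - s)/(s^2 (s + 10))$ with nominal zero $z = 10$ (an inference from section 7.2.1, not a statement of the text) and uses hypothetical helper names. With the zero moved to $z = 9.99$ and the controller pole left at $10$, the characteristic polynomial changes sign on the positive real axis, so the I/O transfer function acquires an unstable pole just below $s = 10$.

```python
def char_poly(s, plant_zero):
    """Characteristic polynomial den_p*den_k + num_p*num_k evaluated at s.

    Assumed plant: P0 = (plant_zero - s) / (s^2 (s + 10)).
    Controller:    K  = (33s + 36) / (10 - s).
    """
    return s**2 * (s + 10) * (10 - s) + (plant_zero - s) * (33 * s + 36)

def bisect_root(f, lo, hi, iters=60):
    """Locate a sign change of f on [lo, hi] by bisection."""
    f_lo = f(lo)
    if f_lo * f(hi) >= 0:
        raise ValueError("no sign change on the interval")
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)

# Perturbed plant zero at 9.99: the cancellation with the controller pole fails.
unstable_pole = bisect_root(lambda s: char_poly(s, 9.99), 9.0, 10.0)
print(f"closed-loop pole near s = {unstable_pole:.4f}")
```

Because the root found here differs from the perturbed zero at $9.99$, it is not canceled in $T$ and shows up as a right half-plane pole of the I/O transfer function.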

7.2.2 The Desoer-Chan Definition 

Desoer and Chan gave a definition of internal stability of a closed-loop system 
that rules out the problem with our example in section 7.2.1 and other similar 
pathologies. The definition is: 

Definition 7.1: The closed-loop system with plant P and controller K is internally 
stable if the four transfer matrices 



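\begin{align}
H_{uv_1} &= K (I - P_{yu} K)^{-1} P_{yu}, \tag{7.11} \\
H_{uv_2} &= K (I - P_{yu} K)^{-1}, \tag{7.12} \\
H_{yv_1} &= (I - P_{yu} K)^{-1} P_{yu}, \tag{7.13} \\
H_{yv_2} &= (I - P_{yu} K)^{-1} \tag{7.14}
\end{align}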



are stable. In this case we say the controller K stabilizes the plant P. 



These transfer matrices can be interpreted as follows. Suppose that $v_1$ and $v_2$
are an input-referred process noise and a sensor noise, respectively, as shown in
figure 7.3. Then the four transfer matrices (7.11-7.14) are the closed-loop transfer
matrices from these noises to $u$ and $y$. So, roughly speaking, internal stability
requires that a small process or sensor noise does not result in a very large actuator
or sensor signal.

Figure 7.3 Sensor and actuator noises used in the formal definition of
internal stability. $K$ stabilizes $P$ if the transfer matrices from $v_1$ and $v_2$ to
$u$ and $y$ are all stable.
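For a single-actuator, single-sensor loop, definition 7.1 can be tested mechanically. Writing $P_{yu} = n_p/d_p$ and $K = n_k/d_k$ in lowest terms, the four transfer functions (7.11-7.14) share the denominator $d_p d_k - n_p n_k$; it is a standard fact (assumed here, not derived in the text) that, when these transfer functions are proper, internal stability is equivalent to this polynomial having all its roots in the open left half-plane. The sketch below checks that with a Routh array in exact arithmetic. The function names are hypothetical, and the choice $P_{yu} = -P_0$ for the 1-DOF example is an assumption about the sign convention of chapter 2.

```python
from fractions import Fraction

def poly_mul(p, q):
    """Multiply polynomials given as coefficient lists, highest power first."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += Fraction(a) * Fraction(b)
    return out

def poly_sub(p, q):
    """Return p - q, padding the shorter coefficient list with leading zeros."""
    n = max(len(p), len(q))
    p = [Fraction(0)] * (n - len(p)) + [Fraction(a) for a in p]
    q = [Fraction(0)] * (n - len(q)) + [Fraction(b) for b in q]
    return [a - b for a, b in zip(p, q)]

def is_hurwitz(coeffs):
    """Routh test: True if every root of the polynomial has negative real part."""
    c = [Fraction(a) for a in coeffs]
    while c and c[0] == 0:          # drop leading zeros
        c.pop(0)
    if not c:
        return False
    if c[0] < 0:                    # normalize the sign of the leading coefficient
        c = [-a for a in c]
    degree = len(c) - 1
    row_a = c[0::2]
    row_b = c[1::2] + [Fraction(0)] * (len(c[0::2]) - len(c[1::2]))
    for _ in range(degree - 1):
        if row_b[0] <= 0:           # a nonpositive first-column entry rules out stability
            return False
        new = [(row_b[0] * row_a[j + 1] - row_a[0] * row_b[j + 1]) / row_b[0]
               for j in range(len(row_a) - 1)] + [Fraction(0)]
        row_a, row_b = row_b, new
    return degree == 0 or row_b[0] > 0

def internally_stable(num_p, den_p, num_k, den_k):
    """SISO test of definition 7.1 under the convention u = K y."""
    char = poly_sub(poly_mul(den_p, den_k), poly_mul(num_p, num_k))
    return is_hurwitz(char)

# Motivating example with P_yu = -P0 = (s - 10) / (s^2 (s + 10)) (assumed convention).
cancelling = internally_stable([1, -10], [1, 10, 0, 0], [33, 36], [-1, 10])

# A simple contrast case: P0 = 1/(s - 1) with constant gain K = 2, so P_yu = -1/(s - 1).
simple_loop = internally_stable([-1], [1, -1], [2], [1])

print(cancelling, simple_loop)
```

The first loop fails the test even though its I/O transfer function is stable; the second passes because its characteristic polynomial is $s + 1$.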



The specification of internal stability can be made in our framework as follows. 
We must include sensor and actuator-referred process noises in the exogenous input
signal $w$, and we must include $u$ and $y$ in the regulated variables vector $z$, in which
case the four transfer matrices (7.11-7.14) appear as submatrices of the closed-loop 
transfer matrix H. Internal stability is then expressed as the specification that 
these entries of H be stable, which we mentioned in chapter 6 (section 6.4.1) is an 
affine specification. 

It seems clear that any sensible set of design specifications should limit the effect 
of sensor and process noise on u and y. Indeed, a sensible set of specifications will 
constrain these four transfer matrices more tightly than merely requiring stability; 
some norm of these transfer matrices will be constrained. So sensible sets of spec- 
ifications will generally be strictly tighter than internal stability; internal stability 
will be a redundant specification. We will see examples of this in chapter 12: for 
the LQG and $H_\infty$ problems, finiteness of the objective will imply internal stability,
provided the objectives satisfy certain sensibility requirements. See the Notes and 
References at the end of this chapter. 

7.2.3 Closed-loop Affineness of Internal Stability 

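We now consider the specification that $H$ is the closed-loop transfer matrix achieved
by a controller that stabilizes the plant:

\[ \mathcal{H}_{\rm stable} = \left\{ H \;\middle|\; H = P_{zw} + P_{zu} K (I - P_{yu} K)^{-1} P_{yw} \ \mbox{for some $K$ that stabilizes $P$} \right\}. \tag{7.15} \]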



Of course, this is a stronger specification than realizability. 

Like $\mathcal{H}_{\rm rlzbl}$, $\mathcal{H}_{\rm stable}$ is also affine: if $K$ and $\tilde{K}$ each stabilize $P$, then for each
$\lambda \in \mathbf{R}$, the controller $K_\lambda$ given by (7.10) also stabilizes $P$. In the example of
section 7.1.1, the controllers $K^{(a)}$ and $K^{(b)}$ each stabilize the plant. Hence the five
controllers that realize the five step responses shown in figure 7.2 also stabilize the 
plant. 

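We can establish that $\mathcal{H}_{\rm stable}$ is affine by direct calculation. Suppose that controllers
$K$ and $\tilde{K}$ each stabilize the plant, yielding closed-loop transfer matrices $H$
and $\tilde{H}$, respectively. We substitute the controller $K_\lambda$ given in (7.10) into the four
transfer matrices (7.11-7.14), and after some algebra we find that

\begin{align*}
K_\lambda (I - P_{yu} K_\lambda)^{-1} P_{yu} &= \lambda K (I - P_{yu} K)^{-1} P_{yu} + (1 - \lambda) \tilde{K} (I - P_{yu} \tilde{K})^{-1} P_{yu}, \\
K_\lambda (I - P_{yu} K_\lambda)^{-1} &= \lambda K (I - P_{yu} K)^{-1} + (1 - \lambda) \tilde{K} (I - P_{yu} \tilde{K})^{-1}, \\
(I - P_{yu} K_\lambda)^{-1} P_{yu} &= \lambda (I - P_{yu} K)^{-1} P_{yu} + (1 - \lambda) (I - P_{yu} \tilde{K})^{-1} P_{yu}, \\
(I - P_{yu} K_\lambda)^{-1} &= \lambda (I - P_{yu} K)^{-1} + (1 - \lambda) (I - P_{yu} \tilde{K})^{-1}.
\end{align*}

Thus, the four transfer matrices (7.11-7.14) achieved by $K_\lambda$ are affine combinations
of those achieved by $K$ and $\tilde{K}$. Since the right-hand sides of these equations are all
stable, the left-hand sides are stable, and therefore $K_\lambda$ stabilizes $P$.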

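We can use the same device that we used to simplify our description of $\mathcal{H}_{\rm rlzbl}$.
The four transfer matrices (7.11-7.14) can be expressed in terms of the transfer
matrix $R = K (I - P_{yu} K)^{-1}$ given in (7.7):

\begin{align}
K (I - P_{yu} K)^{-1} P_{yu} &= R P_{yu}, \tag{7.16} \\
K (I - P_{yu} K)^{-1} &= R, \tag{7.17} \\
(I - P_{yu} K)^{-1} P_{yu} &= (I + P_{yu} R) P_{yu}, \tag{7.18} \\
(I - P_{yu} K)^{-1} &= I + P_{yu} R. \tag{7.19}
\end{align}

Hence we have

\[ \mathcal{H}_{\rm stable} = \left\{ P_{zw} + P_{zu} R P_{yw} \;\middle|\; R P_{yu},\ R,\ (I + P_{yu} R) P_{yu},\ I + P_{yu} R \ \mbox{are stable} \right\}. \tag{7.20} \]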



We will find this description useful. 

7.2.4 Internal Stability for a Stable Plant 

If the plant $P$ is stable, then in particular $P_{yu}$ is stable. It follows that if $R$ is stable,
then so are $R P_{yu}$, $I + P_{yu} R$, and $(I + P_{yu} R) P_{yu}$. Hence we have

\[ \mathcal{H}_{\rm stable} = \left\{ P_{zw} + P_{zu} R P_{yw} \;\middle|\; R \ \mbox{stable} \right\}. \tag{7.21} \]

This is just our description of $\mathcal{H}_{\rm rlzbl}$, with the additional constraint that $R$ be stable.
Given any stable $R$, the controller that stabilizes $P$ and yields a closed-loop
transfer matrix $H = P_{zw} + P_{zu} R P_{yw}$ is

\[ K = (I + R P_{yu})^{-1} R. \tag{7.22} \]



Conversely, every controller that stabilizes P can be expressed by (7.22) for some 
stable R. 
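In the scalar case the correspondence between $R$ and $K$ is easy to check numerically. The sketch below assumes that $R$ is related to $K$ by $R = K (1 - P_{yu} K)^{-1}$, which is consistent with the closed-loop formula $H = P_{zw} + P_{zu} K (I - P_{yu} K)^{-1} P_{yw}$ of section 7.1.1; the particular $P_{yu}$ and $R$ are hypothetical stable examples chosen only for illustration.

```python
def p_yu(s):
    """Hypothetical stable plant entry P_yu(s) = 1 / (s + 1)."""
    return 1.0 / (s + 1.0)

def r_param(s):
    """Hypothetical stable parameter R(s) = (s + 3) / (s + 2)."""
    return (s + 3.0) / (s + 2.0)

def controller(s):
    """Scalar form of (7.22): K = (1 + R P_yu)^{-1} R."""
    return r_param(s) / (1.0 + r_param(s) * p_yu(s))

def max_mismatch(samples):
    """Largest deviation from R = K (1 - P_yu K)^{-1} and
    (1 - P_yu K)^{-1} = 1 + P_yu R over the sample points."""
    worst = 0.0
    for s in samples:
        k, p, r = controller(s), p_yu(s), r_param(s)
        sens = 1.0 / (1.0 - p * k)
        worst = max(worst, abs(k * sens - r), abs(sens - (1.0 + p * r)))
    return worst

samples = [0.5j, 1.0 + 2.0j, 3.0, 10.0j]
print(max_mismatch(samples))
```

The mismatch is at the level of rounding error, which illustrates that (7.22) inverts the map from $K$ to $R$ and that the remaining closed-loop quantities are affine in $R$.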



7.2.5 Internal Stability via Interpolation Conditions 

For the classical SASS 1-DOF control system (see section 2.3.2), the specification of 
internal stability can be expressed in terms of the I/O transfer function T, although 
the specification is not as simple as stability of T alone (recall our motivating 
example). We have already noted that a closed-loop transfer matrix $H$ that is
realizable has the form

\[ H = \left[ \begin{array}{ccc} P_0 (1 - T) & -T & T \\ -T & -T/P_0 & T/P_0 \end{array} \right]
= \left[ \begin{array}{ccc} P_0 & 0 & 0 \\ 0 & 0 & 0 \end{array} \right]
+ T \left[ \begin{array}{ccc} -P_0 & -1 & 1 \\ -1 & -1/P_0 & 1/P_0 \end{array} \right], \tag{7.23} \]

where $T$ is the I/O transfer function. We will describe $\mathcal{H}_{\rm stable}$ as the set of transfer
matrices of the form (7.23), where $T$ satisfies some additional conditions.

Let $p_1, \ldots, p_n$ be the unstable poles of $P_0$ (i.e., the poles of $P_0$ that have nonnegative
real part) and let $z_1, \ldots, z_m$ be the unstable zeros of $P_0$ (i.e., the zeros of $P_0$
that have nonnegative real part); we will assume for simplicity that they are distinct
and have multiplicity one. Let $r$ denote the relative degree of $P_0$, i.e., the degree of
its denominator minus the degree of its numerator. Then a transfer matrix
$H$ is achievable with a stabilizing controller if and only if it has the form (7.23),
where $T$ satisfies:

1. $T$ is stable,

2. $T(p_1) = \cdots = T(p_n) = 1$,

3. $T(z_1) = \cdots = T(z_m) = 0$, and

4. the relative degree of $T$ is at least $r$.

These conditions are known as the interpolation conditions (on $T$; they can also
be expressed in terms of $S$ or other closed-loop transfer functions). The interpolation
conditions can be easily understood in classical control terms. Condition 2 reflects
the fact that the loop gain $P_0 K$ is infinite at the unstable plant poles, and so we
have perfect tracking ($T = 1$) at these frequencies. Conditions 3 and 4 reflect the
fact that there is no transmission through $P_0$ at a frequency where $P_0$ has a zero
(a finite unstable zero for condition 3, a zero at infinity for condition 4),
and thus $T = 0$ at such a frequency. (Internal stability prohibits an unstable pole
or zero of $P_0$ from being canceled by a zero or pole of $K$.)

The interpolation conditions are also readily understood in terms of our description
of $\mathcal{H}_{\rm stable}$ given in (7.20). Substituting the plant transfer matrix for the 1-DOF
control system (2.10) into (7.20) and using $R = T/P_0$ we get:

\[ \mathcal{H}_{\rm stable} = \left\{ H \ \mbox{of form (7.23)} \;\middle|\; T,\ T/P_0,\ (1 - T) P_0 \ \mbox{are stable} \right\}. \tag{7.24} \]

Assuming $T$ is stable, $T/P_0$ will be stable if $T$ vanishes at $z_1, \ldots, z_m$ and in addition
$T$ has relative degree at least that of $P_0$; in other words, $T/P_0$ is stable if conditions
1, 3, and 4 of the interpolation conditions hold. Similarly, $(1 - T) P_0$ will be stable
if $T$ is stable and $1 - T$ vanishes at $p_1, \ldots, p_n$ (i.e., conditions 1 and 2 of the
interpolation conditions hold).
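As an illustration, the interpolation conditions can be evaluated for the I/O transfer function of the motivating example in section 7.2.1. This is a sketch under stated assumptions: the plant is taken to be $P_0(s) = (10 - s)/(s^2 (s + 10))$ (inferred from that example), and its double pole at the origin falls outside the simple-pole assumption made above, so only the value $T(0)$ is checked.

```python
from fractions import Fraction

def t_io(s):
    """I/O transfer function of section 7.2.1: T(s) = (33s + 36) / ((s + 3)^2 (s + 4))."""
    s = Fraction(s)
    return (33 * s + 36) / ((s + 3) ** 2 * (s + 4))

# Unstable poles and zeros of the assumed plant P0 = (10 - s) / (s^2 (s + 10)).
unstable_poles = [0]    # double pole at the origin; only the value T(0) is checked here
unstable_zeros = [10]

cond2 = all(t_io(p) == 1 for p in unstable_poles)   # T(p_i) = 1
cond3 = all(t_io(z) == 0 for z in unstable_zeros)   # T(z_i) = 0

# Relative degree = denominator degree minus numerator degree.
rel_deg_p0 = 3 - 1
rel_deg_t = 3 - 1
cond4 = rel_deg_t >= rel_deg_p0

print(cond2, cond3, cond4, t_io(10))
```

Condition 3 fails because $T(10) \neq 0$: the stable $T$ of the motivating example cannot be achieved by a stabilizing controller, in agreement with the analysis of $H$ in section 7.2.1.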

The interpolation conditions are the earliest description of $\mathcal{H}_{\rm stable}$, and date
back at least to 1955 (see the Notes and References at the end of this chapter for 
details). 

7.2.6 General Free Parameter Representation 

In the general case there is a free parameter description of the set of closed-loop
transfer matrices achievable with stabilizing controllers:

\[ \mathcal{H}_{\rm stable} = \left\{ T_1 + T_2 Q T_3 \;\middle|\; Q \ \mbox{is a stable $n_u \times n_y$ transfer matrix} \right\}, \tag{7.25} \]

where $T_1$, $T_2$, and $T_3$ are certain stable transfer matrices that depend on the plant.
$Q$ is referred to as the parameter in (7.25), not in the sense of a real number that is
to be designed (e.g., the integrator time constant in a PI controller), but rather in
the sense that it is the free parameter in the description (7.25). We saw a special
case of this form already in the example of the stable plant; in that case, $P_{zw}$, $P_{zu}$,
and $P_{yw}$ are possible choices for $T_1$, $T_2$, and $T_3$, respectively.

The controller that stabilizes the plant and yields closed-loop transfer matrix
$H = T_1 + T_2 Q T_3$ has the linear fractional form

\[ K_Q = (A + BQ)^{-1} (C + DQ), \tag{7.26} \]

where $A$, $B$, $C$, $D$ are certain stable transfer matrices related to $T_1$, $T_2$, and $T_3$.
Thus the dependence of $K_Q$ on $Q$ is bilinear (cf. equation (7.10)).

It is not hard to understand the basic idea behind the free parameter represen- 
tation (7.25) of the set of achievable closed-loop transfer matrices (7.20), although 
a complete derivation is fairly involved (see the Notes and References at the end of 
this chapter). 

We consider the subspace of $n_u \times n_y$ transfer matrices given by

\[ \mathcal{S} = \left\{ S \;\middle|\; S,\ P_{yu} S,\ S P_{yu},\ P_{yu} S P_{yu} \ \mbox{are stable} \right\}. \]

The basic idea is that an $S \in \mathcal{S}$ must have the appropriate zeros that cancel the
unstable poles of $P_{yu}$. These zeros can be arranged by multiplying a stable transfer
matrix on the left and right by appropriate stable transfer matrices $D$ and $\tilde{D}$:

\[ \mathcal{S} = \left\{ D Q \tilde{D} \;\middle|\; Q \ \mbox{is stable} \right\}. \]

$D$ and $\tilde{D}$ are not unique, but any suitable choice has the property that if $Q$ is stable,
then each of $D Q \tilde{D}$, $P_{yu} D Q \tilde{D}$, $D Q \tilde{D} P_{yu}$, and $P_{yu} D Q \tilde{D} P_{yu}$ is stable. We shall not
derive the form of $D$ and $\tilde{D}$.

By comparing (7.20) and (7.25) we see that one possible choice for $T_2$ and $T_3$
in (7.25) is

\[ T_2 = P_{zu} D, \qquad T_3 = \tilde{D} P_{yw}. \]

$T_1$ can be taken to be any closed-loop transfer matrix achieved by some stabilizing
controller. The references cited at the end of this chapter contain the complete 
details. 



7.3 Modified Controller Paradigm 

The descriptions of $\mathcal{H}_{\rm stable}$ given in the previous sections can be given an interpretation
in terms of modifying a given nominal controller that stabilizes the plant.
Given one controller $K_{\rm nom}$ that stabilizes the plant, we can construct a large family
of controllers that stabilize the plant, just as the formula (7.10) shows how to
construct a one-parameter family of controllers that stabilize the plant.
The construction proceeds as follows:

• We modify or augment the nominal controller $K_{\rm nom}$ so that it produces an
auxiliary output signal $e$ (of the same size as $y$) and accepts an auxiliary input
signal $v$ (of the same size as $u$) as shown in figure 7.4. This augmentation is
done in such a way that the closed-loop transfer matrix from $v$ to $e$ is zero
while the open-loop controller transfer matrix from $y$ to $u$ remains $K_{\rm nom}$.

• We connect a stable $n_u \times n_y$ transfer matrix $Q$ from $e$ to $v$ as shown in
figure 7.5, and collect $K_{\rm nom}$ and $Q$ together to form a new controller, $K$.

Figure 7.4 The nominal controller $K_{\rm nom}$ is augmented to produce a signal
$e$ and accept a signal $v$. The closed-loop transfer function from $v$ to $e$ is 0.

The intuition is that $K$ should also stabilize $P$, since the $Q$ system we added
to $K_{\rm nom}$ is stable and "sees no feedback", and thus cannot destabilize our system.
However, $Q$ can change the closed-loop transfer matrix $H$. To see how $Q$ affects the
closed-loop transfer matrix $H$, we define the following transfer matrices in figure 7.4:

• $U_1$ is the closed-loop transfer matrix from $w$ to $z$,

• $U_2$ is the closed-loop transfer matrix from $v$ to $z$,

• $U_3$ is the closed-loop transfer matrix from $w$ to $e$.

Figure 7.5 Modification of nominal controller $K_{\rm nom}$ with a stable transfer
matrix $Q$.



Since the transfer matrix from v to e in figure 7.4 is zero, we can redraw figure 7.5 
as figure 7.6. Figure 7.6 can then be redrawn as figure 7.7, which makes it clear 
that the closed-loop transfer matrix H resulting from our modified controller K is 
simply 



\[ H = U_1 + U_2 Q U_3, \tag{7.27} \]

which must be stable because $Q$, $U_1$, $U_2$, and $U_3$ are all stable.

It can be seen from (7.27) that as Q varies over all stable transfer matrices, H 
sweeps out the following affine set of closed-loop transfer matrices: 



\[ \mathcal{H}_{\rm mcp} = \left\{ U_1 + U_2 Q U_3 \;\middle|\; Q \ \mbox{stable} \right\}. \]

Of course, $\mathcal{H}_{\rm mcp} \subseteq \mathcal{H}_{\rm stable}$. This means that a (possibly incomplete) family of
stabilizing controllers can be generated from the (augmented) nominal controller
using this modified controller paradigm.

If the augmentation of the nominal controller is done properly, then the modified
controller paradigm yields every controller that stabilizes the plant $P$, in other
words, $\mathcal{H}_{\rm mcp} = \mathcal{H}_{\rm stable}$. In this case, $U_1$, $U_2$, and $U_3$ can be used as $T_1$, $T_2$, and $T_3$
in the free parameter representation of $\mathcal{H}_{\rm stable}$ given in equation (7.25).

Figure 7.6 Figure 7.5 redrawn.

Figure 7.7 Figure 7.6 redrawn.



7.3.1 Modified Controller Paradigm for a Stable Plant 

As an example of the modified controller paradigm, we consider the special case of 
a stable plant (see section 7.2.4). Since the plant is stable, the nominal controller 
$K_{\rm nom} = 0$ stabilizes the plant.

How do we modify the zero controller to produce $e$ and accept $v$? One obvious
method is to add $v$ into $u$, and let $e$ be the difference between $y$ and $P_{yu} u$, which
ensures that the closed-loop transfer matrix from v to e is zero, as required by the 
modified controller paradigm. This is shown in figure 7.8. 

From figure 7.8 we see that 

\[ U_1 = P_{zw}, \qquad U_2 = P_{zu}, \qquad U_3 = P_{yw}. \]

To apply the second step of the modified controller paradigm, we connect a stable
$Q$ as shown in figure 7.9, so that the closed-loop transfer matrix is

\[ H = U_1 + U_2 Q U_3. \]

Figure 7.8 One method of extracting $e$ and injecting $v$ when the plant is
stable.

Figure 7.9 The modified controller paradigm, for a stable plant, using the
augmented controller shown in figure 7.8.

Thus the set of closed-loop transfer matrices achievable by the modified controller 
shown in figure 7.9 is 

\[ \mathcal{H}_{\rm mcp} = \left\{ P_{zw} + P_{zu} Q P_{yw} \;\middle|\; Q \ \mbox{stable} \right\}. \]

The expression here for $\mathcal{H}_{\rm mcp}$ is the same as the expression for $\mathcal{H}_{\rm stable}$ in equation
(7.21) in section 7.2.4. So in this case the modified controller paradigm generates
all stabilizing controllers: any stabilizing controller $K$ for a stable plant $P$ can
be implemented with a suitable stable Q as shown in figure 7.9. 

The reader can also verify that the connection of $Q$ with the augmented nominal
controller yields $K = (I + Q P_{yu})^{-1} Q$, exactly the same formula as (7.22) with $Q$
substituted for $R$.
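A scalar numerical check makes the point of the parametrization visible: the closed-loop transfer function is affine in $Q$, while the controller itself is not. The sketch below evaluates everything at a single frequency; the plant entries and the two choices of $Q$ are hypothetical stable examples, not taken from the text.

```python
def controller(q, p_yu):
    """Scalar form of K = (I + Q P_yu)^{-1} Q."""
    return q / (1.0 + q * p_yu)

def closed_loop(k, p_zw, p_zu, p_yw, p_yu):
    """Scalar form of H = P_zw + P_zu K (I - P_yu K)^{-1} P_yw."""
    return p_zw + p_zu * k / (1.0 - p_yu * k) * p_yw

# Evaluate hypothetical stable plant entries at the frequency s = j.
s = 1j
p_zw, p_zu, p_yw, p_yu = 1 / (s + 1), 2 / (s + 1), 1 / (s + 3), 1 / (s + 1)

# Two hypothetical stable parameters and their midpoint.
q1 = 1.0 + 0j
q2 = 5.0 / (s + 2.0)
lam = 0.5
q_mix = lam * q1 + (1 - lam) * q2

k1, k2, k_mix = (controller(q, p_yu) for q in (q1, q2, q_mix))
h1, h2, h_mix = (closed_loop(k, p_zw, p_zu, p_yw, p_yu) for k in (k1, k2, k_mix))

# H depends affinely on Q; K does not.
affine_gap_h = abs(h_mix - (lam * h1 + (1 - lam) * h2))
affine_gap_k = abs(k_mix - (lam * k1 + (1 - lam) * k2))
print(affine_gap_h, affine_gap_k)
```

The first gap is at rounding level and the second is not, which is the practical reason for designing in terms of $Q$ (or $H$) rather than in terms of the controller coefficients.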



7.4 A State-Space Parametrization 

A general method of applying the modified controller paradigm starts with a nominal
controller that is an estimated-state feedback. The estimated-state-feedback
controller is given by

\[ u = -K_{\rm sfb} \hat{x}, \tag{7.28} \]

where $K_{\rm sfb}$ is some appropriate matrix (the state-feedback gain) and $\hat{x}$ is an estimate
of the component of $x$ due to $u$, governed by the observer equation

\[ \dot{\hat{x}} = A_P \hat{x} + B_u u + L_{\rm est} (y - C_y \hat{x}), \tag{7.29} \]

where $L_{\rm est}$ is some appropriate matrix (the estimator gain). The transfer matrix of
this controller is thus

\[ K_{\rm nom}(s) = -K_{\rm sfb} (sI - A_P + B_u K_{\rm sfb} + L_{\rm est} C_y)^{-1} L_{\rm est}. \]

$K_{\rm nom}$ will stabilize $P$ provided $K_{\rm sfb}$ and $L_{\rm est}$ are chosen such that $A_P - B_u K_{\rm sfb}$
and $A_P - L_{\rm est} C_y$ are stable, which we assume in the sequel.

To augment this estimated-state-feedback nominal controller, we inject $v$ into
$u$, before the observer tap, meaning that (7.28) is replaced by

\[ u = -K_{\rm sfb} \hat{x} + v, \tag{7.30} \]

and therefore the signal $v$ does not induce any observer error. For the signal $e$ we
take the output prediction error:

\[ e = y - C_y \hat{x}. \tag{7.31} \]

This is shown in figure 7.10.

Figure 7.10 The modified controller paradigm as applied to a nominal
estimated-state-feedback controller $K_{\rm nom}$. $v$ is added to the actuator signal
$u$ before the observer tap, and $e$ is the output prediction error. With the
stable $Q$ realization added, the modified controller is called an observer-based
controller.

The requirement that the closed-loop transfer matrix from $v$ to $e$ be zero is
satisfied because the observer state error, $x - \hat{x}$, is uncontrollable from $v$, and
therefore the transfer matrix from $v$ to $x - \hat{x}$ is zero. The transfer matrix from $v$
to $e$ is $C_y$ times this last transfer matrix, and so is zero.

Applying the modified controller paradigm to the estimated-state-feedback con- 
troller yields the observer-based controller shown in figure 7.10. The observer-based 
controller is just an estimated-state-feedback controller, with the output prediction 
error processed through a stable transfer matrix Q and added to the actuator signal 
before the observer tap. 

In fact, this augmentation is such that the modified controller paradigm yields 
every controller that stabilizes the plant. Every stabilizing controller can be real- 
ized (likely nonminimally) as an observer-based controller for some choice of stable 
transfer matrix Q. 

From the observer-based controller we can form simple state-space equations for
the parametrization of all controllers that stabilize the plant, and all closed-loop 
transfer matrices achieved by controllers that stabilize the plant. 

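The state-space equations for the augmented nominal controller are, from (7.29-
7.31),

\begin{align}
\dot{\hat{x}} &= (A_P - B_u K_{\rm sfb} - L_{\rm est} C_y) \hat{x} + L_{\rm est} y + B_u v, \tag{7.32} \\
u &= -K_{\rm sfb} \hat{x} + v, \tag{7.33} \\
e &= y - C_y \hat{x}. \tag{7.34}
\end{align}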


The state-space equations for the closed-loop system with the augmented controller
are then found by eliminating $u$ and $y$ from (7.32-7.34) and the plant equations
(2.19-2.21) of section 2.5:

\begin{align*}
\dot{x} &= A_P x - B_u K_{\rm sfb} \hat{x} + B_w w + B_u v, \\
\dot{\hat{x}} &= L_{\rm est} C_y x + (A_P - B_u K_{\rm sfb} - L_{\rm est} C_y) \hat{x} + L_{\rm est} D_{yw} w + B_u v, \\
z &= C_z x - D_{zu} K_{\rm sfb} \hat{x} + D_{zw} w + D_{zu} v, \\
e &= C_y x - C_y \hat{x} + D_{yw} w.
\end{align*}

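The transfer matrices $T_1$, $T_2$, and $T_3$ can therefore be realized as

\[ \left[ \begin{array}{cc} T_1(s) & T_2(s) \\ T_3(s) & 0 \end{array} \right] = C_T (sI - A_T)^{-1} B_T + D_T, \tag{7.35} \]

where

\[ A_T = \left[ \begin{array}{cc} A_P & -B_u K_{\rm sfb} \\ L_{\rm est} C_y & A_P - B_u K_{\rm sfb} - L_{\rm est} C_y \end{array} \right], \qquad
B_T = \left[ \begin{array}{cc} B_w & B_u \\ L_{\rm est} D_{yw} & B_u \end{array} \right], \]

\[ C_T = \left[ \begin{array}{cc} C_z & -D_{zu} K_{\rm sfb} \\ C_y & -C_y \end{array} \right], \qquad
D_T = \left[ \begin{array}{cc} D_{zw} & D_{zu} \\ D_{yw} & 0 \end{array} \right]. \]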

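If $Q$ has state-space realization

\begin{align}
\dot{x}_Q &= A_Q x_Q + B_Q e, \tag{7.36} \\
v &= C_Q x_Q + D_Q e, \tag{7.37}
\end{align}

then a state-space realization of the observer-based controller can be found by eliminating
$e$ and $v$ from the augmented controller equations (7.32-7.34) and the $Q$
realization (7.36-7.37):

\begin{align}
\dot{\hat{x}} &= (A_P - B_u K_{\rm sfb} - L_{\rm est} C_y - B_u D_Q C_y) \hat{x} + B_u C_Q x_Q + (L_{\rm est} + B_u D_Q) y, \tag{7.38} \\
\dot{x}_Q &= -B_Q C_y \hat{x} + A_Q x_Q + B_Q y, \tag{7.39} \\
u &= -(K_{\rm sfb} + D_Q C_y) \hat{x} + C_Q x_Q + D_Q y, \tag{7.40}
\end{align}

so that

\[ K(s) = C_K (sI - A_K)^{-1} B_K + D_K, \tag{7.41} \]

where

\[ A_K = \left[ \begin{array}{cc} A_P - B_u K_{\rm sfb} - L_{\rm est} C_y - B_u D_Q C_y & B_u C_Q \\ -B_Q C_y & A_Q \end{array} \right], \qquad
B_K = \left[ \begin{array}{c} L_{\rm est} + B_u D_Q \\ B_Q \end{array} \right], \]

\[ C_K = \left[ \begin{array}{cc} -K_{\rm sfb} - D_Q C_y & C_Q \end{array} \right], \qquad D_K = D_Q. \]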

Some algebra verifies that the closed-loop transfer matrix $H$ given by (2.27) of
section 2.5 does indeed equal $T_1 + T_2 Q T_3$.
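The algebra can be spot-checked numerically on a plant with a single state, where all matrices are scalars. The following is only a sketch: the plant data, gains, and one-state $Q$ are hypothetical values chosen for illustration, $D_{yu} = 0$ is assumed as in the equations of this section, and the closed loop is computed from $H = P_{zw} + P_{zu} K (1 - P_{yu} K)^{-1} P_{yw}$.

```python
def inv2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def matmul(x, y):
    """Product of two matrices given as nested lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y))) for j in range(len(y[0]))]
            for i in range(len(x))]

def transfer(a_mat, b_mat, c_mat, d_mat, s):
    """Evaluate C (sI - A)^{-1} B + D at the complex frequency s (A is 2x2)."""
    si_a = [[s - a_mat[0][0], -a_mat[0][1]], [-a_mat[1][0], s - a_mat[1][1]]]
    g = matmul(matmul(c_mat, inv2(si_a)), b_mat)
    return [[g[i][j] + d_mat[i][j] for j in range(len(g[0]))] for i in range(len(g))]

# Hypothetical one-state plant (illustrative values only).
a, bw, bu = 1.0, 1.0, 2.0               # unstable open-loop pole at s = 1
cz, cy = 1.0, 3.0
dzw, dzu, dyw = 0.0, 0.5, 0.2
ksfb, lest = 2.0, 1.5                   # a - bu*ksfb = -3 and a - lest*cy = -3.5
aq, bq, cq, dq = -4.0, 1.0, 2.0, 0.3    # stable Q(s) = cq*bq/(s - aq) + dq

def mismatch(s):
    """Return (|H - (T1 + T2 Q T3)|, |transfer function from v to e|) at s."""
    # T1, T2, T3 from the realization (7.35).
    a_t = [[a, -bu * ksfb], [lest * cy, a - bu * ksfb - lest * cy]]
    b_t = [[bw, bu], [lest * dyw, bu]]
    c_t = [[cz, -dzu * ksfb], [cy, -cy]]
    d_t = [[dzw, dzu], [dyw, 0.0]]
    g = transfer(a_t, b_t, c_t, d_t, s)
    t1, t2, t3, t_ev = g[0][0], g[0][1], g[1][0], g[1][1]
    q = cq * bq / (s - aq) + dq

    # Observer-based controller from (7.41).
    a_k = [[a - bu * ksfb - lest * cy - bu * dq * cy, bu * cq], [-bq * cy, aq]]
    b_k = [[lest + bu * dq], [bq]]
    c_k = [[-ksfb - dq * cy, cq]]
    k = transfer(a_k, b_k, c_k, [[dq]], s)[0][0]

    # Closed loop computed directly from the plant transfer functions.
    pzw = cz * bw / (s - a) + dzw
    pzu = cz * bu / (s - a) + dzu
    pyw = cy * bw / (s - a) + dyw
    pyu = cy * bu / (s - a)
    h = pzw + pzu * k / (1.0 - pyu * k) * pyw
    return abs(h - (t1 + t2 * q * t3)), abs(t_ev)

for s in (1j, 2.0 + 1.0j, 0.5):
    print(s, mismatch(s))
```

Both quantities are at rounding level at each sample frequency, consistent with $H = T_1 + T_2 Q T_3$ and with the transfer function from $v$ to $e$ being zero.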



7.5 Some Generalizations of Closed-Loop Stability 

So far, our discussion in this chapter has been built around the notion of a stable 
transfer function, i.e., a transfer function for which each pole $p$ satisfies $\Re p < 0$. We
saw in chapter 5 that stability is equivalent to several other important properties of a
transfer function, e.g., finiteness of its peak or RMS gain. In fact, the material in this
chapter can be adapted to various generalized notions of stability. The references
discuss these ideas in a general setting; we will describe a specific example in more
detail.

Instead of stability, we consider the requirement that each pole of a transfer
function should satisfy $\Re p \le -0.1$ and $|\Im p| \le -\Re p$. We will call such transfer
functions $\mathcal{G}$-stable ($\mathcal{G}$ stands for generalized). In classical terminology, $\mathcal{G}$-stability
guarantees a stability degree and a minimum damping ratio for a transfer function,
as illustrated in figure 7.11.
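The pole region is simple to test in code. The sketch below uses hypothetical function names and reads the two inequalities as non-strict, matching the "at least" wording of the classical interpretation; the damping ratio of a pole $p$ is taken as $-\Re p / |p|$, so the sector condition corresponds to a damping ratio of at least $1/\sqrt{2}$.

```python
import math

def is_g_stable(poles, degree=0.1):
    """True if every pole p satisfies Re p <= -degree and |Im p| <= -Re p."""
    return all(complex(p).real <= -degree and abs(complex(p).imag) <= -complex(p).real
               for p in poles)

def damping_ratio(p):
    """Damping ratio -Re p / |p| of a nonzero pole p."""
    p = complex(p)
    return -p.real / abs(p)

# Poles of the I/O transfer function of section 7.2.1: a double pole at -3 and one at -4.
print(is_g_stable([-3, -3, -4]))
# A stable but lightly damped pair, and a stable pole that is too slow.
print(is_g_stable([-1 + 2j, -1 - 2j]), is_g_stable([-0.05]))
print(damping_ratio(-1 + 1j), 1 / math.sqrt(2))
```

A pole on the boundary ray $|\Im p| = -\Re p$ has damping ratio exactly $1/\sqrt{2}$, which is where the figure's sector comes from.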




Figure 7.11 A transfer function is $\mathcal{G}$-stable if its poles lie in the region
to the left. In classical control terminology, such transfer functions have a
stability degree of at least 0.1 and a damping ratio of at least $1/\sqrt{2}$. All
of the results in this chapter can be adapted to this generalized notion of
stability.

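We say that a controller $K$ $\mathcal{G}$-stabilizes the plant $P$ if every entry of the four
transfer matrices (7.11-7.14) is $\mathcal{G}$-stable (cf. definition 7.1). It is not hard to show
that $\mathcal{H}_{\mathcal{G}\text{-stable}}$, the specification that the closed-loop transfer matrix is achievable
by a $\mathcal{G}$-stabilizing controller, is affine; in fact, we have

\[ \mathcal{H}_{\mathcal{G}\text{-stable}} = \left\{ P_{zw} + P_{zu} R P_{yw} \;\middle|\; R P_{yu},\ R,\ (I + P_{yu} R) P_{yu},\ I + P_{yu} R \ \mbox{are $\mathcal{G}$-stable} \right\}, \]

which is just (7.20), with "$\mathcal{G}$-stable" substituted for "stable".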

For the SASS 1-DOF control system, the specification $\mathcal{H}_{\mathcal{G}\text{-stable}}$ can be expressed
in terms of the interpolation conditions described in section 7.2.5, with the following
modifications: condition (1) becomes "$T$ is $\mathcal{G}$-stable", and the list of poles and zeros
of $P_0$ must be expanded to include any poles and zeros that are $\mathcal{G}$-unstable, i.e., lie
in the right-hand region in figure 7.11.

There is a free parameter representation of $\mathcal{H}_{\mathcal{G}\text{-stable}}$:

\[ \mathcal{H}_{\mathcal{G}\text{-stable}} = \left\{ T_1 + T_2 Q T_3 \;\middle|\; Q \ \mbox{is a $\mathcal{G}$-stable $n_u \times n_y$ transfer matrix} \right\}, \]

which is (7.25) with "$\mathcal{G}$-stable" substituted for "stable". This free parameter representation
can be developed from state-space equations exactly as in section 7.4,
provided the state-feedback and estimator gains are chosen such that $A_P - B_u K_{\rm sfb}$
and $A_P - L_{\rm est} C_y$ are $\mathcal{G}$-stable, i.e., their eigenvalues lie in the left-hand region of
figure 7.11.

Notes and References

Realizability 

Freudenberg and Looze [FL88] refer to some of the realizability constraints (e.g., S + 
T = 1 in the classical 1-DOF control system) as algebraic constraints on closed-loop 
transfer functions. In contrast, they refer to constraints imposed by stability of T and the 
interpolation conditions as analytic. 

Internal Stability 

Arguments forbidding unstable pole-zero cancellations can be found in any text on classical 
control, e.g., [Oga90, p606-607]. For MAMS plants, finding a suitable definition of a zero
is itself a difficult task, so this classical unstable cancellation rule was not easily extended. 
An extensive discussion of internal stability and state-space representations can be found 
in Kailath [Kai80, p175] or Callier and Desoer [CD82a]. 

Desoer and Chan's definition appears in [DC75]. Their definition has been widely used
since, e.g., in [Fra87, p15-17] and [Vid85, p99-108].

Parametrization for Stable Plants 

The parametrization given in (7.21) appears for example in the articles [Zam81], [DC81a], 
[BD86], and chapter 8 of Callier and Desoer [CD82a]. In process control, the parametrization
is called the internal model principle, since the controller $K$ in figure 7.9 contains a
model of $P_{yu}$; see Morari and Zafiriou [MZ89, ch.3].

Parametrization via Interpolation Conditions 

An early version of the interpolation conditions appears in Truxal's 1955 book [Tru55, 
p308-309]. There he states that if $P_0$ has an unstable zero, so should the closed-loop I/O
transfer function T. He does not mention unstable plant poles, and his reasoning is not 
quite right (see [BBN90]). 

The first essentially correct and explicit statement of the interpolation conditions appears 
in a 1956 paper by Bertram on discrete-time feedback control [Ber56]. He states: 

In summary, for [the classical SASS 1-DOF control system], the following 
design restrictions must be considered. 

• Any zeros of the plant on or outside the unit circle in the z-plane must 
be contained in [T(z)]. 

• Any poles of the plant on or outside the unit circle in the z-plane must 
be contained in [1 − T(z)].

He does not explicitly state that these conditions are not only necessary, but also sufficient 
for closed-loop stability; but it is implicit in his design procedure. 

Another early exposition of the interpolation conditions can be found in chapter 7 of 
Ragazzini and Franklin's 1958 book [RF58, p157-158]. The equivalent interpolation
conditions for continuous-time systems first appear in a 1958 paper by Bigelow [Big58].

A recent paper that uses the interpolation conditions is Zames and Francis [ZF83].
Interpolation conditions for MAMS plants appear in [AS84].

Parametrization for MAMS Plants

The results on parametrization of achievable closed-loop transfer matrices in the multiple- 
actuator, multiple-sensor case depend on factorizations of transfer matrices. Early treat- 
ments use factorization of transfer matrices in terms of matrices of polynomials; see 
e.g., [Ros70] and [Wol74]; extensive discussion appears in [Kai80]. The first parametriza- 
tion of closed-loop transfer matrices that can be achieved with stabilizing controllers ap- 
pears in Youla, Jabr, and Bongiorno's articles on Wiener-Hopf design [YJB76, YBJ76]. 
For discrete-time systems, the parametrization appears in the book by Kucera [Kuc79]. 

A more recent version of the parametrization uses factorization in terms of stable trans- 
fer matrices, and appears first in Desoer, Liu, Murray and Saeks [DLM80]. The book 
by Vidyasagar [Vid85, ch.3,5] contains a complete treatment of the parametrization of
achievable closed-loop transfer matrices in terms of stable factorizations. A state-space
parametrization can be found in Francis [Fra87, ch.4] or Vidyasagar [Vid85].

Parametrization Using Observer-Based Controller 

The observer-based controller parametrization was first pointed out by Doyle [Doy84]; it 
also appears in Anderson and Moore [AM90, §9.2], Maciejowski [Mac89, §6.4], and a 
recent article by Moore, Glover, and Telford [MGT90]. 

Why We Hear So Much About Stability 

We mentioned in section 7.2.2 that any sensible set of design specifications will constrain 
the four critical transfer matrices more tightly than merely requiring stability. For example, 
the specifications may include specific finite limits on some norm of the four critical transfer 
matrices, such as 

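\[ \|H_{uv_1}\|_\infty \le a_{11}, \quad \|H_{uv_2}\|_\infty \le a_{12}, \quad \|H_{yv_1}\|_\infty \le a_{21}, \quad \|H_{yv_2}\|_\infty \le a_{22}, \tag{7.42} \]

whereas internal stability only requires that these norms be finite:

\[ \|H_{uv_1}\|_\infty < \infty, \quad \|H_{uv_2}\|_\infty < \infty, \quad \|H_{yv_1}\|_\infty < \infty, \quad \|H_{yv_2}\|_\infty < \infty. \tag{7.43} \]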

In any particular problem, a design in which these transfer matrices are extremely large but 
stable is just as unacceptable as a design in which one or more of these transfer matrices 
is actually unstable. So in any particular problem, the "qualitative" (affine) specification 
of internal stability (7.43) will need to be replaced by a stronger "quantitative" (convex 
but not affine) specification such as (7.42). 

We see so much discussion about the qualitative specification of internal stability for 
historical reasons. In Newton, Gould, and Kaiser [NGK57, p21] we find 

In the classical view, a feedback control problem could be identified almost 
always as a stability problem. To the early workers in the field, the problem 
of assuring stability was nearly always the foremost consideration. ... [A bad 
controller] caused the system to exhibit sustained oscillations of the output 
even though the input was quiescent. This phenomenon, often called hunting, 
so plagued the control engineer that even to the present time [1957] it has all 
but dwarfed the many other aspects of the feedback control problem. 

In Black [Bla34] we find

It is far from a simple proposition to employ feedback in this way because of
the very special control required of phase shifts in the amplifier and circuits, 
not only throughout the useful frequency band but also for a wide range of 
frequencies above and below this band. Unless these relations are maintained, 
singing will occur, usually at frequencies outside the useful range. Once hav- 
ing achieved a design, however, in which proper phase relations are secured, 
experience has demonstrated that the performance obtained is perfectly reli- 
able. 



Chapter 8 

Performance Specifications 



In this chapter we consider in detail performance specifications, which limit the 
response of the closed-loop system to the various commands and disturbances 
that may act on it. We show that many of these performance specifications are 
closed-loop convex. 

We organize our discussion of performance specifications by their meaning or pur- 
pose (in the context of controller design), and not by the mathematical form of 
the constraints. In fact we shall see that performance specifications with different 
meanings, such as a limit on errors to commands, a minimum acceptable level of 
regulation, and a limit on actuator effort, can be expressed in similar forms as lim- 
its on the size of a particular submatrix of the closed-loop transfer matrix H. For 
this reason our discussion of these types of specifications will become briefer as the 
chapter progresses and the reader can refer back to chapter 5 or other specifications 
that have a similar form. 

To facilitate this organization of performance specifications by meaning, we par- 
tition the exogenous input vector w as follows: 

$$ w = \begin{bmatrix} w_c \\ w_d \\ w_{\rm etc} \end{bmatrix} \quad \begin{array}{l} n_c \text{ commands} \\ n_d \text{ disturbances} \\ \text{other components of } w \end{array} $$

The $n_c$ components of $w_c$ are the command, reference, or set-point signals: the
"input" in classical control terminology. The $n_d$ components of $w_d$ are the
disturbance or noise signals. The vector signal $w_{\rm etc}$ contains all remaining exogenous
inputs; some of these signals will be discussed in chapter 10.
We partition the regulated variables:

$$ z = \begin{bmatrix} z_c \\ z_a \\ z_o \\ z_{\rm etc} \end{bmatrix} \quad \begin{array}{l} n_c \text{ commanded variables} \\ n_a \text{ actuator signals} \\ n_o \text{ other critical signals} \\ \text{other components of } z \end{array} $$

The $n_c$ components of $z_c$ are the regulated variables that the commands $w_c$ are
intended to control or regulate: the "output" in classical control terminology. The
$n_a$ components of $z_a$ are the actuator signals, which we remind the reader must be
included in $z$ in any sensible formulation of the controller design problem. The $n_o$
components of $z_o$ are other critical signals such as sensor signals or state variables.
The vector signal $z_{\rm etc}$ contains all remaining regulated variables.
We conformally partition the closed-loop transfer matrix H: 



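$$ \begin{bmatrix} z_c \\ z_a \\ z_o \\ z_{\rm etc} \end{bmatrix} = \begin{bmatrix} H_{cc} & H_{cd} & * \\ H_{ac} & H_{ad} & * \\ H_{oc} & H_{od} & * \\ * & * & * \end{bmatrix} \begin{bmatrix} w_c \\ w_d \\ w_{\rm etc} \end{bmatrix} $$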



The symbol * is used to denote a submatrix of H that is not used to formulate 
performance specifications (some of these submatrices will be used in chapter 10). 
In the next few sections we consider specifications on each of the other submatrices 
of H. We remind the reader that a convex or affine specification or functional on a 
submatrix of H corresponds to a convex or affine specification or functional on the 
entire matrix H (see section 6.2.3). 

8.1 Input/Output Specifications 

In this section we consider specifications on $H_{cc}$, the closed-loop transfer matrix
from the command or set-point inputs to the commanded variables, i.e., the variables
the commands are intended to control. In classical control terminology, $H_{cc}$ consists
of the closed-loop I/O transfer functions. Of course, it is possible for a control
system to have no command inputs or commanded variables ($n_c = 0$), e.g., the
classical regulator.

The submatrix $H_{cc}$ determines the response of the commanded variables $z_c$ to
the command inputs $w_c$ only; $z_c$ will in general also be affected by the disturbances
($w_d$) and the other exogenous input signals ($w_{\rm etc}$); these effects are considered in the
next section. Thus, the signal $H_{cc} w_c$ is the noise-free response of the commanded
variables. In this section, we will assume for convenience that $w_d = 0$, $w_{\rm etc} = 0$,
so that $z_c = H_{cc} w_c$. In other words, throughout this section $z_c$ will denote the
noise-free response of the commanded variables.



8.1.1 Step Response Specifications 

Specifications are sometimes expressed in terms of the step response of $H_{cc}$,
especially when there is only one command signal and one commanded variable ($n_c = 1$).
The step response gives a good indication of the response of the controlled variable
to command inputs that are constant for long periods of time and occasionally
change quickly to a new value (sometimes called the set-point). We first consider
the case $n_c = 1$. Let $s(t)$ denote the step response of the transfer function $H_{cc}$.

Asymptotic Tracking

A common specification on $H_{cc}$ is

$$ \lim_{t \to \infty} s(t) = H_{cc}(0) = 1, $$

which means that for $w_c$ constant (and as mentioned above, $w_d = 0$, $w_{\rm etc} = 0$),
$z_c(t)$ converges to $w_c$ as $t \to \infty$, or equivalently, the closed-loop transfer function
from the command to the commanded variable is one at $s = 0$. We showed in
section 6.4.1 that the specification

$$ \mathcal{H}_{\rm asympt\_trk} = \{ H \mid H_{cc}(0) = 1 \} $$

is affine, since the functional

$$ \phi(H) = H_{cc}(0) $$

is affine.

A strengthened version of asymptotic tracking is asymptotic tracking of order
$k$: $H_{cc}(0) = 1$, $H_{cc}^{(j)}(0) = 0$, $1 \le j \le k$. This specification is commonly encountered
for $k = 1$ and $k = 2$, and referred to as "asymptotic tracking of ramps" or "zero
steady-state velocity error" (for $k = 1$) and "zero steady-state acceleration error"
(for $k = 2$). These higher order asymptotic tracking specifications are also affine.

Overshoot and Undershoot 

We define two functionals of $H_{cc}$: the overshoot,

$$ \phi_{\rm os}(H_{cc}) = \sup_{t \ge 0} s(t) - 1, $$

and the undershoot,

$$ \phi_{\rm us}(H_{cc}) = \sup_{t \ge 0} -s(t). $$

Figure 8.1 shows a typical step response and the values of these functionals.
These functionals are convex, so the specifications

$$ \mathcal{H}_{\rm os} = \{ H \mid \phi_{\rm os}(H_{cc}) \le \alpha \}, $$

$$ \mathcal{H}_{\rm us} = \{ H \mid \phi_{\rm us}(H_{cc}) \le \alpha \} $$

are convex: for example, if each of two step responses does not exceed 10% overshoot
($\alpha = 0.1$) then neither does their average (see section 6.4.2 and figure 6.6).

Figure 8.1 A typical step response $s$ and its overshoot ($\phi_{\rm os}$) and undershoot
($\phi_{\rm us}$). The asymptotic tracking specification $\mathcal{H}_{\rm asympt\_trk}$ requires that
the step response converge to one as $t \to \infty$.
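The convexity of the overshoot functional can be checked numerically on sampled step responses. The following is only a sketch under our own assumptions: the two underdamped second-order responses, the sampling grid, and the helper names (`second_order_step`, `overshoot`, `undershoot`) are illustrative choices and do not come from the text.

```python
import math

def second_order_step(zeta, wn, t):
    """Unit step response of wn^2 / (s^2 + 2*zeta*wn*s + wn^2), for 0 < zeta < 1."""
    wd = wn * math.sqrt(1.0 - zeta**2)
    decay = math.exp(-zeta * wn * t)
    return 1.0 - decay * (math.cos(wd * t) + zeta * wn / wd * math.sin(wd * t))

def overshoot(s):
    """phi_os: sup over t of s(t) - 1, evaluated on the samples."""
    return max(s) - 1.0

def undershoot(s):
    """phi_us: sup over t of -s(t), evaluated on the samples."""
    return max(-x for x in s)

ts = [0.01 * k for k in range(3001)]                 # 0 .. 30 seconds
s1 = [second_order_step(0.4, 1.0, t) for t in ts]    # assumed example response
s2 = [second_order_step(0.6, 2.0, t) for t in ts]    # assumed example response
lam = 0.5
s_avg = [lam * a + (1 - lam) * b for a, b in zip(s1, s2)]

os1, os2, os_avg = overshoot(s1), overshoot(s2), overshoot(s_avg)
us_avg = undershoot(s_avg)
print(os1, os2, os_avg, us_avg)
```

The overshoot of the averaged response never exceeds the average of the two overshoots, which is the convexity inequality for $\phi_{\rm os}$ with $\lambda = 0.5$.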



The functionals $\phi_{\rm os}$ and $\phi_{\rm us}$ are usually used together with the asymptotic
tracking constraint $\mathcal{H}_{\rm asympt\_trk}$; otherwise we might use the relative overshoot and relative
undershoot defined by

$$ \phi_{\rm ros}(H_{cc}) = \begin{cases} \sup_{t \ge 0} s(t)/H_{cc}(0) - 1 & \text{if } H_{cc}(0) > 0, \\ +\infty & \text{if } H_{cc}(0) \le 0, \end{cases} $$

$$ \phi_{\rm rus}(H_{cc}) = \begin{cases} \sup_{t \ge 0} -s(t)/H_{cc}(0) & \text{if } H_{cc}(0) > 0, \\ +\infty & \text{if } H_{cc}(0) \le 0. \end{cases} $$

It is less obvious that the specifications

$$ \mathcal{H}_{\rm ros} = \{ H \mid \phi_{\rm ros}(H_{cc}) \le \alpha \}, \qquad (8.1) $$

$$ \mathcal{H}_{\rm rus} = \{ H \mid \phi_{\rm rus}(H_{cc}) \le \alpha \} \qquad (8.2) $$

are convex. To see that the relative overshoot constraint $\mathcal{H}_{\rm ros}$ is convex, we rewrite
it as

$$ \mathcal{H}_{\rm ros} = \{ H \mid H_{cc}(0) > 0, \ s(t) - (1 + \alpha) H_{cc}(0) \le 0 \text{ for all } t \ge 0 \}. $$

If $H, \tilde{H} \in \mathcal{H}_{\rm ros}$ and $0 \le \lambda \le 1$, then $H_\lambda = \lambda H + (1 - \lambda)\tilde{H}$ satisfies $H_{\lambda,cc}(0) > 0$,
and for each $t \ge 0$ we have $s_\lambda(t) - (1 + \alpha) H_{\lambda,cc}(0) \le 0$, where $s_\lambda$ is the step
response of $H_{\lambda,cc}$. Hence, $H_\lambda \in \mathcal{H}_{\rm ros}$.

Since the functional inequality specifications (8.1-8.2) are convex for each $\alpha$,
the relative overshoot and relative undershoot functionals are quasiconvex; they are
not, however, convex. If one step response, $s(t)$, has a relative overshoot of 30%,
and another step response $\tilde{s}(t)$ has a relative overshoot of 10%, then their average
has a relative overshoot not exceeding 30%; but it may exceed 20%, the average
of the two relative overshoots. An example of two such step responses is shown in
figure 8.2.




Figure 8.2 The relative overshoots of the step responses $s$ and $\tilde{s}$ are 30%
and 10% respectively. Their average, $(s + \tilde{s})/2$, has a relative overshoot of
23%. This example shows that relative overshoot is not a convex functional
of $H$. It is, however, quasiconvex.
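A very small numerical illustration of the same phenomenon is sketched below. The three-sample "step responses" are toy signals of our own choosing (they are not the responses plotted in figure 8.2), and the last sample is assumed to stand in for the final value $H_{cc}(0)$.

```python
def relative_overshoot(s):
    """phi_ros on samples; the last sample stands in for H_cc(0) = lim s(t)."""
    final = s[-1]
    if final <= 0:
        return float("inf")
    return max(s) / final - 1.0

s_a = [0.0, 1.3, 1.0]     # toy response: settles at 1.0, peaks at 1.3  (30%)
s_b = [0.0, 0.11, 0.1]    # toy response: settles at 0.1, peaks at 0.11 (10%)
s_avg = [(a + b) / 2 for a, b in zip(s_a, s_b)]

ros_a = relative_overshoot(s_a)
ros_b = relative_overshoot(s_b)
ros_avg = relative_overshoot(s_avg)
print(ros_a, ros_b, ros_avg)
```

Here the average peaks at 0.705 and settles at 0.55, so its relative overshoot is about 28%: no larger than the worse of the two (quasiconvexity), but larger than their 20% average (so the functional is not convex).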



Rise Time and Settling Time 

There are many definitions of rise time and settling time in use; we shall use

$$ \phi_{\rm rise}(H_{cc}) = \inf \{ T \mid s(t) > 0.8 \text{ for } t \ge T \}, $$

$$ \phi_{\rm settle}(H_{cc}) = \inf \{ T \mid |s(t) - 1| < 0.05 \text{ for } t \ge T \}, $$

as illustrated in figure 8.3. The functional $\phi_{\rm rise}$ is usually used together with the
asymptotic tracking specification $\mathcal{H}_{\rm asympt\_trk}$; we can also define relative or
normalized rise time.

Figure 8.3 The value of the rise-time functional, $\phi_{\rm rise}$, is the earliest time
after which the step response always exceeds 0.8. The value of the settling-time
functional, $\phi_{\rm settle}$, is the earliest time after which the step response is
always within 5% of 1.0.

The functional inequality specifications

$$ \mathcal{H}_{\rm rise} = \{ H \mid \phi_{\rm rise}(H_{cc}) \le T_{\max} \}, $$

$$ \mathcal{H}_{\rm settle} = \{ H \mid \phi_{\rm settle}(H_{cc}) \le T_{\max} \} $$

are convex: if two step responses each settle to within 5% within some time limit
$T_{\max}$, then so does their average. Thus, the rise-time and settling-time functionals
$\phi_{\rm rise}$ and $\phi_{\rm settle}$ are quasiconvex, i.e.,

$$ \phi_{\rm rise}(\lambda H_{cc} + (1 - \lambda)\tilde{H}_{cc}) \le \max\{ \phi_{\rm rise}(H_{cc}), \phi_{\rm rise}(\tilde{H}_{cc}) \} $$

for all $0 \le \lambda \le 1$, but we do not generally have

$$ \phi_{\rm rise}(\lambda H_{cc} + (1 - \lambda)\tilde{H}_{cc}) \le \lambda \phi_{\rm rise}(H_{cc}) + (1 - \lambda)\phi_{\rm rise}(\tilde{H}_{cc}), $$

so they are not convex. Figure 8.4 demonstrates two step responses for which this
inequality, with $\lambda = 0.5$, is violated.
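The rise-time and settling-time functionals are straightforward to evaluate on a sampled step response. The sketch below assumes a uniform time grid and uses two first-order responses of our own choosing; the helper `time_after_which` is a hypothetical name, and the values it returns are accurate only to within one sample period.

```python
import math

def time_after_which(ts, s, ok):
    """Grid version of inf{T | ok(s(t)) for all t >= T}; inf if the last sample fails."""
    last_bad = -1
    for k, value in enumerate(s):
        if not ok(value):
            last_bad = k
    if last_bad == len(s) - 1:
        return math.inf
    return ts[last_bad + 1]

def rise_time(ts, s):
    return time_after_which(ts, s, lambda v: v > 0.8)

def settling_time(ts, s):
    return time_after_which(ts, s, lambda v: abs(v - 1.0) < 0.05)

dt = 0.001
ts = [dt * k for k in range(10001)]                  # 0 .. 10 seconds
s_fast = [1.0 - math.exp(-t) for t in ts]            # step response of 1/(s+1)
s_slow = [1.0 - math.exp(-t / 3.0) for t in ts]      # step response of 1/(3s+1)
s_avg = [0.5 * (a + b) for a, b in zip(s_fast, s_slow)]

t_rise = rise_time(ts, s_fast)          # close to ln 5
t_settle = settling_time(ts, s_fast)    # close to ln 20
t_rise_slow = rise_time(ts, s_slow)
t_rise_avg = rise_time(ts, s_avg)
print(t_rise, t_settle, t_rise_slow, t_rise_avg)
```

The rise time of the averaged response is bounded by the larger of the two individual rise times, as quasiconvexity requires; nothing guarantees that it is bounded by their average.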

We mention that there are other definitions of rise-time functionals that are not
quasiconvex. One example of such a definition is the time for the step response to
rise from the signal level 0.1 (10%) to 0.9 (90%):

$$ \phi_{10\text{-}90}(H_{cc}) = \inf \{ t \mid s(t) \ge 0.9 \} - \inf \{ t \mid s(t) \ge 0.1 \}. $$

While this may be a useful functional of $H_{cc}$ in some contexts, we doubt the utility
of the (nonconvex) specification $\phi_{10\text{-}90}(H_{cc}) \le T_{\max}$, which can be satisfied by
a step response with a long initial delay or a step response with very large high
frequency oscillations.



Figure 8.4 The rise times of the step responses $s$ and $\tilde{s}$ are 5.0 and 0.8
seconds respectively. Their average, $(s + \tilde{s})/2$, has a rise time of 3.38 > 5.8/2
seconds. This example shows that rise time is not a convex functional of $H$,
although it is quasiconvex.



General Step Response Envelope Specifications 

Many of the step response specifications considered so far are special cases of general
envelope constraints on the step response:

$$ \mathcal{H}_{\rm env} = \{ H \mid s_{\min}(t) \le s(t) \le s_{\max}(t) \text{ for all } t \ge 0 \}, \qquad (8.3) $$

where $s_{\min}(t) \le s_{\max}(t)$ for all $t \ge 0$. An example of an envelope constraint is shown
in figure 8.5. The envelope constraint $\mathcal{H}_{\rm env}$ is convex, since if two step responses lie
in the allowable region, then so does their average.

We mention one useful way that a general envelope constraint $\mathcal{H}_{\rm env}$ can be
expressed as a functional inequality specification with a convex functional. We
define the maximum envelope violation,

$$ \phi_{\rm max\_env\_viol}(H_{cc}) = \sup_{t \ge 0} \max \{ s(t) - s_{\max}(t), \ s_{\min}(t) - s(t), \ 0 \}. $$

The envelope specification $\mathcal{H}_{\rm env}$ can be expressed as

$$ \mathcal{H}_{\rm env} = \{ H \mid \phi_{\rm max\_env\_viol}(H_{cc}) \le 0 \}. $$
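As a rough illustration, the maximum envelope violation can be evaluated on a time grid as sketched below. The exponential bounds are the ones quoted in the caption of figure 8.6; the two test responses and the function name `max_envelope_violation` are our own assumptions.

```python
import math

def max_envelope_violation(ts, s, s_min, s_max):
    """Grid estimate of sup_t max{ s(t) - s_max(t), s_min(t) - s(t), 0 }."""
    worst = 0.0
    for t, v in zip(ts, s):
        worst = max(worst, v - s_max(t), s_min(t) - v)
    return worst

def s_min(t):
    return 1.0 - 1.2 * math.exp(-t)

def s_max(t):
    return 1.0 + math.exp(-t)

ts = [0.01 * k for k in range(1001)]                  # 0 .. 10 seconds
inside = [1.0 - math.exp(-2.0 * t) for t in ts]       # assumed response inside the envelope
outside = [1.0 - 1.5 * math.exp(-t) for t in ts]      # assumed response that starts too low

viol_inside = max_envelope_violation(ts, inside, s_min, s_max)
viol_outside = max_envelope_violation(ts, outside, s_min, s_max)
print(viol_inside, viol_outside)
```

A response meets the envelope specification exactly when the functional is zero; the second response violates the lower bound by $0.3 e^{-t}$, so its worst violation of 0.3 occurs at $t = 0$.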



Figure 8.5 General envelope specification on a step response. The two step
responses $s_1$ and $s_2$ satisfy this constraint; their average (shown in dashed
line) also satisfies the envelope constraint.



General Response-Time Functional

The quasiconvex functionals $\phi_{\rm rise}$ and $\phi_{\rm settle}$ are special cases of a simple, general
paradigm for measuring the response time of a unit step response. Suppose that we
have upper and lower bounds for a general envelope constraint, $s_{\max}(t)$ and $s_{\min}(t)$.
Suppose that $s_{\max}(t)$ does not increase, and $s_{\min}(t)$ does not decrease, for increasing
$t$. For each $T > 0$, we consider the time-scaled envelope specification

$$ s_{\min}(t/T) \le s(t) \le s_{\max}(t/T) \text{ for all } t \ge 0. \qquad (8.4) $$

(8.4) defines a nested family of convex specifications, parametrized by $T$. For $T = 1$
we have the original envelope specification with bounds $s_{\min}$ and $s_{\max}$; for $T > 1$
we have a weaker specification, and for $T < 1$ we have a stronger specification;
roughly speaking, $T$ is the normalized response time. We define the generalized
response-time functional as

$$ \phi_{\rm grt}(H_{cc}) = \inf \{ T \mid s_{\min}(t/T) \le s(t) \le s_{\max}(t/T) \text{ for all } t \ge 0 \}. $$

This construction is shown in figure 8.6. The comments at the end of section 6.2.2
show that $\phi_{\rm grt}$ is quasiconvex.
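Because the time-scaled specifications are nested, the generalized response time of a sampled step response can be found by bisection on $T$. The sketch below assumes the envelope bounds from the caption of figure 8.6 and a test response of our own choosing, for which the exact answer is 0.5; the search bracket, the tolerance, and the function names are hypothetical.

```python
import math

def s_min(t):
    return 1.0 - 1.2 * math.exp(-t)

def s_max(t):
    return 1.0 + math.exp(-t)

def meets_scaled_envelope(ts, s, T, tol=1e-12):
    """Check s_min(t/T) <= s(t) <= s_max(t/T) at every grid point."""
    return all(s_min(t / T) - tol <= v <= s_max(t / T) + tol for t, v in zip(ts, s))

def generalized_response_time(ts, s, T_lo=1e-3, T_hi=1e3, iters=60):
    """Bisection on T; valid because the specifications are nested in T."""
    if not meets_scaled_envelope(ts, s, T_hi):
        return math.inf
    for _ in range(iters):
        T_mid = math.sqrt(T_lo * T_hi)        # geometric midpoint of the bracket
        if meets_scaled_envelope(ts, s, T_mid):
            T_hi = T_mid
        else:
            T_lo = T_mid
    return T_hi

ts = [0.005 * k for k in range(2001)]                     # 0 .. 10 seconds
s = [1.0 - 1.2 * math.exp(-2.0 * t) for t in ts]          # assumed response
phi_grt = generalized_response_time(ts, s)
print(phi_grt)
```

For this response the lower bound $1 - 1.2 e^{-t/T}$ is satisfied exactly when $1/T \le 2$, so the bisection converges to $T = 0.5$.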



Figure 8.6 A step response is shown together with the envelopes $s_{\min}(t/T)$
and $s_{\max}(t/T)$ for three values of $T$, where $s_{\min}(t) = 1 - 1.2 \exp(-t)$ and
$s_{\max}(t) = 1 + \exp(-t)$. For $T = 0.2$ the step response just lies inside the
envelopes, so the value of $\phi_{\rm grt}$ is 0.2.



Step Response Interaction

We now consider the case where there are multiple commands (and multiple
commanded variables) so that $H_{cc}$ is an $n_c \times n_c$ transfer matrix, where $n_c > 1$. Its
diagonal entries are the transfer functions from the command inputs to their associated
commanded variables, which may be required to meet the various specifications
discussed above, e.g., limits on overshoot or rise time. The off-diagonal entries of
$H_{cc}$ are the transfer functions from the commands to other commanded variables,
and are called the command interaction transfer functions. It is generally desirable
that these transfer functions be small, so that each command input does not
excessively disturb the other commanded variables.

Let $s(t)$ denote the step response matrix of $H_{cc}$. One mild constraint on command
interaction is asymptotic decoupling:

$$ \mathcal{H}_{\rm asympt\_dcpl} = \left\{ H \;\middle|\; H_{cc}(0) = \lim_{t \to \infty} s(t) \text{ is diagonal} \right\}. \qquad (8.5) $$

This specification ensures that if the commands are constant, then the effect on
each commanded variable due to the other commands converges to zero: there is
no steady-state interaction for constant commands.

A stronger specification that limits command interaction is an envelope constraint
on each entry of $s(t)$,

$$ \mathcal{H}_{\rm mimo\_env} = \{ H \mid s_{\min}(t) \le s(t) \le s_{\max}(t) \text{ for all } t \ge 0 \}, \qquad (8.6) $$

where $s_{\min}(t)$ and $s_{\max}(t)$ are matrices, and the inequalities in (8.6) are component
by component. The envelope specification $\mathcal{H}_{\rm mimo\_env}$ is convex.

An example of $\mathcal{H}_{\rm mimo\_env}$ is shown in figure 8.7, along with a step matrix $s(t)$
that meets it. Of course, the responses shown in figure 8.7 are for steps applied to
the inputs one at a time. Figure 8.8 shows the response of the commanded variables
to a particular command signal that has a set-point change in $w_1$ at $t = 0.5$ and
then a set-point change in $w_2$ at $t = 2.0$. The perturbation in $z_2$ right after $t = 0.5$
and the perturbation in $z_1$ right after $t = 2.0$ are due to command interaction.
The specification $\mathcal{H}_{\rm mimo\_env}$ limits this perturbation, and guarantees that after the
set-point change for $z_2$, for example, the effect on $z_1$ will fade away.

Figure 8.7 Design specifications requiring the decoupling of responses to
step commands.

An extreme form of $\mathcal{H}_{\rm mimo\_env}$ is to require that the off-diagonal step responses
be zero, or equivalently, that $H_{cc}$ be diagonal. This is called exact or complete
decoupling:

$$ \mathcal{H}_{\rm dcpl} = \{ H \mid H_{cc} \text{ is diagonal} \}. $$

This specification forbids any command interaction at all, regardless of the command
signals. $\mathcal{H}_{\rm dcpl}$ is affine.

Figure 8.8 An example of command interaction in a two-input, two-output
system. The individual step responses of the system are shown in figure 8.7.
Output $z_1$ tracks a step change in the command signal $w_1$. However, $z_1$
is perturbed by a step change in the command signal $w_2$. Similarly, $z_2$ is
perturbed by a step change in $w_1$.



Miscellaneous Step Response Specifications 

Other specifications often expressed in terms of the step response of $H_{cc}$ include
monotonic step response and I/O slew-rate limits. The monotonic step response
constraint is

$$ \mathcal{H}_{\rm mono\_sr} = \{ H \mid s(t) \text{ is nondecreasing for all } t \ge 0 \} = \{ H \mid h(t) \ge 0 \text{ for all } t \ge 0 \}, $$

where $h(t)$ is the impulse response of $H_{cc}$. Thus, $\mathcal{H}_{\rm mono\_sr}$ requires that the
commanded variable move only in the direction of the new set-point command in
response to a single abrupt change in set-point. $\mathcal{H}_{\rm mono\_sr}$ is a stricter specification
than requiring that both the undershoot and overshoot be zero. $\mathcal{H}_{\rm mono\_sr}$ is convex.
A related specification is a slew-rate limit on the step response:

$$ \mathcal{H}_{\rm slew\_sr} = \left\{ H \;\middle|\; \left| \frac{d}{dt} s(t) \right| = |h(t)| \le M_{\rm slew} \text{ for all } t \ge 0 \right\}. $$



Response to Other Inputs 

We close this section by noting that the response of $z_c$ to any particular command
signal, and not just a unit step, can be substituted for the step response in all of
the specifications above. This specific input tracking requirement is convex.

For example, we may require that in response to the particular command signal
$w_c = w_{\rm pgm}$ shown in figure 8.9, the commanded variable $z_c$ lies in the envelope
shown. $w_{\rm pgm}$ might represent an often repeated temperature cycle in an industrial
oven (the mnemonic abbreviates "program"). The specification

$$ \mathcal{H}_{\rm pgm\_trk} = \{ H \mid \| H_{cc} w_{\rm pgm} - w_{\rm pgm} \|_\infty \le 30 \}, $$

shown in figure 8.9, requires that the actual temperature, $z_c$, always be within 30°C
of the commanded temperature, $w_c$.

Figure 8.9 An example of a temperature command signal that might be
used in a plastics process. Powder is slowly melted, and then sintered for
3 hours. It is then rapidly cooled through the melt point. The envelope
constraints on the actual temperature require the temperature error to be
less than 30°C. (Horizontal axis: $t$ in hours.)



8.1.2 Tracking Error Formulation 

Step response specifications constrain the response of the system to specific
commands: a step input at each command. By linearity and time-invariance, this
constrains the response to commands that are constant for long periods and change
abruptly to new values, which is sometimes a suitable model for the commands that
will be encountered in practice. In many cases, however, the typical command
signals are more diverse: they may change frequently in a way that is not completely
predictable. This is often the case in a command-following system, where the goal
is to have some system variables follow or track a continuously changing command
signal.

In the tracking error formulation of I/O specifications, we define the tracking
error as $e_{\rm trk} = z_c - w_c$: the difference between the actual response of the commanded
variables and the commands, as shown in figure 8.10.



Figure 8.10 An architecture for expressing I/O specifications in terms of
the tracking error $e_{\rm trk} = z_c - w_c$.

We will assume that $e_{\rm trk}$ is available as a part of $z$, and $H_{\rm trk}$ will denote the
submatrix of $H$ that is the closed-loop transfer matrix from the commands $w_c$ to
the tracking errors $e_{\rm trk}$. A general tracking error specification has the form

$$ \mathcal{H}_{\rm trk} = \{ H \mid \| H_{\rm trk} \|_{\rm trk\_err} \le \alpha \}, \qquad (8.7) $$

i.e., the closed-loop transfer matrix from commands to tracking error should be
small, as measured with the norm $\| \cdot \|_{\rm trk\_err}$. Using any of the norms from chapter 5
allows a wide variety of I/O specifications to be formed from the general tracking
error specification $\mathcal{H}_{\rm trk}$, all of which are convex. We will briefly list some of these
specifications and their interpretations.



RMS Mistracking Limit 

One simplified model of the command signal is that it is a stochastic process with a
known power spectrum $S_{\rm cmd}$. Of course this model is quite crude, and only intended
to capture a few key features of the command signal, such as size and bandwidth:
the command signal may in fact be generated by a human operator. If we accept
this model, and take the RMS value as our measure of the size of the tracking
error, the general tracking specification (8.7) can be expressed as the weighted $H_2$
norm-bound

$$ \mathcal{H}_{\rm rms\_trk} = \{ H \mid \| H_{\rm trk} W \|_2 \le \alpha \}, \qquad (8.8) $$

where $W$ is a spectral factor of $S_{\rm cmd}$: $S_{\rm cmd}(\omega) = W(j\omega)^* W(j\omega)$.

Worst Case RMS Mistracking Limit 

We may reduce our a priori assumptions about the command signal even further,
by assuming that we do not know the spectrum, but know only a maximum
$W_{\rm cmd}$-weighted RMS value for the command signal $w_c$, where $W_{\rm cmd}$ is some appropriate
weight. If our measure of the size of the tracking error $e_{\rm trk}$ is the worst case
$W_{\rm trk}$-weighted RMS value it might have, where $W_{\rm trk}$ is some appropriate weight, then the
appropriate norm in the general tracking error specification (8.7) is the weighted
$H_\infty$ norm:

$$ \mathcal{H}_{\rm hinf\_trk} = \{ H \mid \| W_{\rm trk} H_{\rm trk} W_{\rm cmd} \|_\infty \le \alpha \}. \qquad (8.9) $$

For $n_c = 1$ (meaning the weights are scalar, and the maximum singular value of the
transfer function is simply its magnitude), this specification can also be cast in the
more classical form:

$$ \mathcal{H}_{\rm hinf\_trk} = \{ H \mid |H_{\rm trk}(j\omega)| \le l_{\rm trk}(\omega), \ H_{\rm trk} \text{ is stable} \}, \qquad (8.10) $$

where

$$ l_{\rm trk}(\omega) = \frac{\alpha}{|W_{\rm cmd}(j\omega) W_{\rm trk}(j\omega)|}. $$

The classical interpretation is that $l_{\rm trk}(\omega)$ is a frequency-dependent limit on the
tracking error transfer function, and the specification (8.10) ensures that the
"command to tracking error transfer function is small at those frequencies where the
command has significant energy". An example is shown in figure 8.11.
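For a scalar tracking error transfer function, both forms of the specification can be checked on a frequency grid. The sketch below is only illustrative: the transfer functions $s/(s+a)$, the command weight $10/(s+1)$, the unit error weight, the level $\alpha = 1$, and all function names are assumptions of ours, and a grid maximum only estimates the $H_\infty$ norm.

```python
def H_trk(a):
    """Hypothetical tracking-error transfer function s/(s+a), evaluated at s = jw."""
    return lambda w: (1j * w) / (1j * w + a)

def W_cmd(w):
    """Assumed command weight 10/(s+1)."""
    return 10.0 / (1j * w + 1.0)

def W_trk(w):
    """Assumed tracking-error weight (unity)."""
    return 1.0

def weighted_peak(H, freqs):
    """Grid estimate of || W_trk H W_cmd ||_inf for scalar H."""
    return max(abs(W_trk(w) * H(w) * W_cmd(w)) for w in freqs)

def l_trk(w, alpha):
    """Classical frequency-dependent bound alpha / |W_cmd(jw) W_trk(jw)|."""
    return alpha / abs(W_cmd(w) * W_trk(w))

alpha = 1.0
freqs = [10.0 ** (-3 + 6 * k / 2000) for k in range(2001)]   # 1e-3 .. 1e3 rad/s

H1, H2, H3 = H_trk(20.0), H_trk(12.0), H_trk(5.0)
H_avg = lambda w: 0.5 * (H1(w) + H2(w))

peaks = {name: weighted_peak(H, freqs)
         for name, H in [("H1", H1), ("H2", H2), ("H3", H3), ("avg", H_avg)]}
classical_ok = all(abs(H1(w)) <= l_trk(w, alpha) for w in freqs)
print(peaks, classical_ok)
```

With these choices the first two transfer functions meet the bound, so their average does too (convexity), while the third, whose corner frequency is too low, exceeds it.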

Figure 8.11 Upper bounds on frequency response magnitudes are convex.
The bound $l(\omega)$ on the tracking error transfer function ensures that the
tracking error transfer function is below -20dB at frequencies below 10Hz,
and rolls off below 1Hz. Two transfer functions $H_{\rm trk}$ and $\tilde{H}_{\rm trk}$ that satisfy
the specification (8.10) are shown, together with their average. Of course,
the magnitude of $(H_{\rm trk} + \tilde{H}_{\rm trk})/2$ is not the average of the magnitudes of
$H_{\rm trk}$ and $\tilde{H}_{\rm trk}$, although it is no larger than the average. (Horizontal axis: $f$ in Hz.)

Worst Case Peak Mistracking Limit

Another specific form that the general tracking error specification (8.7) can take is
a worst case peak mistracking limit:

$$ \mathcal{H}_{\rm pk\_trk} = \{ H \mid \| H_{\rm trk} \|_{\rm pk\_gn} \le \alpha \}. \qquad (8.11) $$

This specification arises as follows. We use an unknown-but-bounded model of the
command signals: we assume only

$$ \| w_c \|_\infty \le M. \qquad (8.12) $$

Our measure of the tracking error is the worst case peak (over all command signals
consistent with (8.12)) of the tracking error $e_{\rm trk}$:

$$ \| e_{\rm trk} \|_\infty \le M_{\rm trk} \text{ whenever } w_c \text{ satisfies (8.12)}. $$

This constraint is precisely (8.11), with $\alpha = M_{\rm trk}/M$.

Since most plants are strictly proper, $\| H_{\rm trk} \|_{\rm pk\_gn}$ will usually be at least one.
This can be seen from the block diagram in figure 8.10: a step change in the
command input $w_c$ will produce an immediate, equal sized change in the tracking
error. After some time, the closed-loop system will drive the tracking error to a
smaller value. For this reason, the specification (8.11) may not be useful.
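The lower bound of one can be seen numerically. The sketch below relies on the standard fact that the peak gain of a scalar system with direct feedthrough $d$ and impulse response $h$ is $|d| + \int_0^\infty |h(t)|\,dt$; we assume this agrees with the peak gain norm of chapter 5. The two closed-loop transfer functions $H_{cc}$ and the helper names are illustrative choices of ours.

```python
import math

def l1_norm(h, t_end, n):
    """Trapezoidal estimate of the integral of |h(t)| over [0, t_end]."""
    dt = t_end / n
    total = 0.5 * (abs(h(0.0)) + abs(h(t_end)))
    total += sum(abs(h(k * dt)) for k in range(1, n))
    return total * dt

def tracking_peak_gain(h_cc, t_end=60.0, n=60000):
    # H_trk = H_cc - 1: a feedthrough of magnitude 1 plus the impulse response of H_cc
    return 1.0 + l1_norm(h_cc, t_end, n)

def h_first(t):
    """Impulse response of the assumed H_cc = 1/(s+1)."""
    return math.exp(-t)

wn, zeta = 2.0, 0.5                       # assumed second-order H_cc parameters
wd = wn * math.sqrt(1.0 - zeta**2)

def h_second(t):
    """Impulse response of wn^2 / (s^2 + 2*zeta*wn*s + wn^2)."""
    return wn**2 / wd * math.exp(-zeta * wn * t) * math.sin(wd * t)

pk_first = tracking_peak_gain(h_first)      # analytically equal to 2
pk_second = tracking_peak_gain(h_second)
print(pk_first, pk_second)
```

In both cases the feedthrough term alone already contributes one to the peak gain, so no choice of a strictly proper $H_{cc}$ can bring $\| H_{\rm trk} \|_{\rm pk\_gn}$ below one.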

A useful variation on this worst case peak tracking error limit is to assume more
about the command signals, for example, to assume a maximum slew rate as well
as a maximum peak for the command signal. In this case the appropriate norm in
the general tracking error limit (8.7) would be the worst case norm $\| \cdot \|_{\rm wc}$, from
section 5.2.4.

For example, consider the temperature response envelope shown in figure 8.9.
Provided the system achieves asymptotic tracking of constant commands, so that
$H_{\rm trk}(0) = 0$, we can ignore the offset in the command signal. Since $\| w_c - 250 \|_\infty \le$
150°C and $\| \dot{w}_c \|_\infty \le$ 300°C/Hr, we can specify

$$ \mathcal{H}_{\rm pk\_trk\_slew} = \{ H \mid \| H_{\rm trk} \|_{\rm wc} \le 30^\circ{\rm C} \}, \qquad (8.13) $$

where we take $M_{\rm ampl}$ = 150°C and $M_{\rm slew}$ = 300°C/Hr in the definition of the
norm $\| \cdot \|_{\rm wc}$ (see section 5.2.4). The specification (8.13) is tighter than the envelope
specification in figure 8.9: the specification (8.13) requires a peak tracking error of
no more than 30°C for any command input that is between 100°C and 400°C, and
slew limited by 300°C/Hr, while the specification in figure 8.9 requires the same
peak tracking error for a particular input that is between 100°C and 400°C, and
slew limited by 300°C/Hr.

8.1.3 Model Reference Formulation 

An extension of the tracking error formulation consists of specifying a desired
closed-loop I/O transfer matrix $H_{\rm ref\_des}$, called the reference or model transfer matrix;
the goal is then to ensure that $H_{cc} \approx H_{\rm ref\_des}$. Instead of forming the tracking error
as $e_{\rm trk} = z_c - w_c$, we form the model reference error $e_{\rm mre} = z_c - H_{\rm ref\_des} w_c$: the
difference between the actual response $z_c$ and the desired response, $H_{\rm ref\_des} w_c$. This
is shown in figure 8.12. Note that the tracking error is just the model reference error
when the model transfer matrix is the identity.

Figure 8.12 An architecture for expressing I/O specifications in terms of
the error from a desired transfer matrix $H_{\rm ref\_des}$.



We will assume that the model reference error $e_{\rm mre}$ is contained in $z$. Let $H_{\rm mre}$
denote the submatrix of $H$ that is the closed-loop transfer matrix from the command
signal $w_c$ to the model reference error $e_{\rm mre}$. In the model reference formulation of
I/O specifications, we constrain $H_{\rm mre}$ to be small in some appropriate sense:

$$ \mathcal{H}_{\rm mre} = \{ H \mid \| H_{\rm mre} \|_{\rm mre} \le \alpha \}. \qquad (8.14) $$

The general model reference error specification (8.14) can take a wide variety of
forms, depending on the norm used; we refer the reader to section 8.1.2 for a partial
list, and chapter 5 for a general discussion.



8.2 Regulation Specifications 

In this section we consider the effect on $z_c$ of $w_d$ only, just as in the previous
sections we considered the effect on $z_c$ of the command inputs only. The response
of commanded variables to disturbances is determined by the closed-loop submatrix
$H_{cd}$; regulation specifications require that $H_{cd}$ be "small". It is not surprising, then,
that regulation specifications can usually be expressed in the form of norm-bound
inequalities, i.e.,

$$ \| H_{cd} \|_{\rm reg} \le \alpha, \qquad (8.15) $$

where $\| \cdot \|_{\rm reg}$ is some appropriate norm that depends, for example, on the model
of the disturbances, how we measure the size of the undesired deviation of the
commanded variables, and whether we limit the average or worst case deviation.

In the following sections we describe a few specific forms the general regulation
specification (8.15) can take. Because these specifications have a form similar to
I/O specifications such as limits on tracking error or model reference error, we will
give a briefer description. For convenience we shall assume that $w_d$ is a scalar
disturbance and $z_c$ is a single commanded variable, since the extension to vector
disturbance signals and regulated variables is straightforward.

8.2.1 Rejection of Specific Disturbances 

The simplest model for a disturbance is that it is constant, with some unknown
value. The specification that this constant disturbance be asymptotically rejected
at $z_c$ is simply

$$ \mathcal{H}_{\rm asympt\_rej} = \{ H \mid H_{cd}(0) = 0 \}. $$

This specification has the same form as the asymptotic decoupling specification (8.5)
(and is therefore closed-loop affine), but it has a very different meaning. The
specification $\mathcal{H}_{\rm asympt\_rej}$ can be tightened by limiting the step response of $H_{cd}$ to lie in a
given envelope, as in the command response specifications discussed in section 8.1.
For example, we may require that the effect of a unit step input at $w_d$ on $z_c$ should
decay to no more than 0.05 within some given time $T_{\rm rej}$. Such a specification
ensures that the closed-loop system will counteract the effects of a rapidly applied (or
changed) constant disturbance on the commanded variable.

In most cases, however, disturbances cannot be so easily described. In the next
few sections we discuss specifications that limit the effect of disturbances about
which less is known.

8.2.2 RMS Regulation 

A common model for a disturbance is a stochastic process with a known power
spectral density $S_{\rm dist}$. The specification

$$ \mathcal{H}_{\rm rms\_reg} = \{ H \mid \| H_{cd} W \|_2 \le \alpha \}, \qquad (8.16) $$

where $W$ is a spectral factor of $S_{\rm dist}$, limits the RMS deviation of the commanded
variable (due to the disturbance) to be less than $\alpha$. This specification has exactly
the same form as the RMS mistracking limit (8.8): a weighted $H_2$ norm-bound.

The power spectral density of the disturbance is rarely known precisely; $S_{\rm dist}$
is usually meant to capture only a few key features of the disturbance, perhaps its
RMS value and bandwidth. The power spectral density

$$ S_{\rm dist}(\omega) = \frac{2 a^2 \omega_{\rm bw}}{\omega^2 + \omega_{\rm bw}^2}, $$

for example, might be used to model a disturbance with an RMS value $a$ and a
bandwidth $\omega_{\rm bw}$.
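As a sanity check on such a model, one can integrate the power spectral density numerically and confirm the RMS value. The sketch below assumes the first-order form $S_{\rm dist}(\omega) = 2a^2\omega_{\rm bw}/(\omega^2 + \omega_{\rm bw}^2)$ and the convention that the mean-square value is $(1/2\pi)\int S_{\rm dist}(\omega)\,d\omega$; both are assumptions on our part, as are the numerical values and function names.

```python
import math

def S_dist(w, a, w_bw):
    """Assumed first-order power spectral density with RMS value a and bandwidth w_bw."""
    return 2.0 * a**2 * w_bw / (w**2 + w_bw**2)

def rms_from_psd(S, w_max, n):
    """sqrt((1/2pi) * integral of S over [-w_max, w_max]), using the evenness of S."""
    dw = w_max / n
    half = 0.5 * (S(0.0) + S(w_max)) + sum(S(k * dw) for k in range(1, n))
    return math.sqrt(2.0 * half * dw / (2.0 * math.pi))

a, w_bw = 3.0, 5.0                        # illustrative values
rms = rms_from_psd(lambda w: S_dist(w, a, w_bw), w_max=1e3 * w_bw, n=200000)
half_power_ratio = S_dist(w_bw, a, w_bw) / S_dist(0.0, a, w_bw)
print(rms, half_power_ratio)
```

The integral returns an RMS value very close to $a$ (the small deficit comes from truncating the frequency range), and the density falls to half its low-frequency value at $\omega = \omega_{\rm bw}$.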

8.2.3 Classical Frequency Domain Regulation 

We may not be willing to model the disturbance with a specific power spectral
density. Instead, we may model $w_d$ as having an unknown power spectral density,
but some given maximum RMS value. A limit on the worst case RMS response of
$z_c$ can be expressed as the $H_\infty$ norm-bound

$$ \mathcal{H}_{\rm hinf\_reg} = \{ H \mid \| H_{cd} \|_\infty \le \alpha \}, $$

which limits the RMS gain of the closed-loop transfer function $H_{cd}$. Often, this
specification is modified by frequency domain weights, reflecting the fact that either
a maximum possible weighted-RMS value for the disturbance is assumed, or a limit
on some weighted-RMS value of the commanded variable must be maintained. Such
a frequency-weighted $H_\infty$ norm-bound can be cast in the more classical form:

$$ \mathcal{H}_{\rm hinf\_reg} = \{ H \mid |H_{cd}(j\omega)| \le l_{\rm reg}(\omega), \ H_{cd} \text{ is stable} \}. \qquad (8.17) $$

The classical interpretation is that $l_{\rm reg}(\omega)$ is a frequency-dependent limit on the
disturbance to commanded variable transfer function, and the specification (8.17)
ensures that the "disturbance to commanded variable transfer function is small at
those frequencies where the disturbance has significant energy".

8.2.4 Regulation Bandwidth

The classical frequency domain regulation specification (8.17) is often expressed as
a minimum regulation bandwidth for the closed-loop system. One typical definition
of the regulation bandwidth of the closed-loop system is

$$ \phi_{\rm bw}(H_{\rm cd}) = \sup \{ \Omega \mid |H_{\rm cd}(j\omega)| \le 0.1 \text{ for all } \omega \le \Omega \}, $$

which is the largest frequency below which we can guarantee that the disturbance
to commanded variable transfer function is no more than $-20$dB, as shown in
figure 8.13.

Figure 8.13 The value of the regulation bandwidth functional, $\phi_{\rm bw}$, is
the largest frequency below which the disturbance to commanded variable
transfer function, $H_{\rm cd}$, is no more than $-20$dB.

The minimum bandwidth specification

$$ \mathcal{H}_{\rm min\_bw} = \{ H \mid \phi_{\rm bw}(H_{\rm cd}) \ge \Omega_{\rm min} \} $$

is convex, since it is a frequency-dependent bound on the magnitude of $H$, so
the bandwidth functional $\phi_{\rm bw}$ is quasiconcave, meaning that $-\phi_{\rm bw}$ is quasiconvex.
Alternatively, we note that the inverse of the regulation bandwidth, i.e., $1/\phi_{\rm bw}$,
is quasiconvex. The inverse bandwidth $1/\phi_{\rm bw}$ can be interpreted as a regulation
response time.
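
The regulation bandwidth can be estimated numerically by scanning a frequency grid until the $-20$dB bound is first violated. The following is a minimal sketch, not a definitive implementation: the transfer function `H_cd`, the grid limits, and the function name `regulation_bandwidth` are all assumptions made for illustration, and frequencies below the lowest grid point are not checked.

```python
import math

def regulation_bandwidth(H, level=0.1, w_min=1e-3, w_max=1e3, n=20001):
    """Grid estimate of sup{W : |H(jw)| <= level for all w <= W}.

    The grid starts at w_min > 0, so frequencies below w_min are not checked.
    Returns 0.0 if the bound already fails at w_min.
    """
    bw = 0.0
    for k in range(n):
        # logarithmically spaced frequency grid from w_min to w_max
        w = w_min * (w_max / w_min) ** (k / (n - 1))
        if abs(H(1j * w)) > level:
            break
        bw = w
    return bw

# Hypothetical disturbance to commanded variable transfer function
def H_cd(s):
    return s / (s + 10.0)

bw = regulation_bandwidth(H_cd)

# Minimum bandwidth specification with an assumed limit of 1 rad/s
meets_min_bw = bw >= 1.0
```

For this assumed `H_cd` the bound $|H_{\rm cd}(j\omega)| \le 0.1$ holds exactly up to $\omega = 1/\sqrt{0.99}$, so the grid estimate should land just below that value.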

A generalized definition of bandwidth, analogous to the generalized response
time, is given by

$$ \phi_{\rm gbw}(H_{\rm cd}) = \sup \{ \Omega \mid |H_{\rm cd}(j\omega)| \le M(\omega/\Omega) \text{ for all } \omega \}, $$

where $M(\omega)$ is a non-decreasing frequency-dependent magnitude bound.

8.2.5 Worst Case Peak Regulation 

If we model the disturbance as unknown-but-bounded, say, $\|w_{\rm d}\|_\infty \le M_{\rm d}$, and
require that the worst case peak deviation of the commanded variable due to $w_{\rm d}$ is
less than $M_{\rm max\_pk\_reg}$, i.e.,

$$ \|H_{\rm cd}w_{\rm d}\|_\infty \le M_{\rm max\_pk\_reg} \quad \text{whenever} \quad \|w_{\rm d}\|_\infty \le M_{\rm d}, $$

then we can specify the peak-gain bound

$$ \mathcal{H}_{\rm pk\_dis} = \{ H \mid \|H_{\rm cd}\|_{\rm pk\_gn} \le M_{\rm max\_pk\_reg}/M_{\rm d} \}. $$

8.3 Actuator Effort 

In any control system the size of the actuator signals must be limited, i.e.,

$$ \|u\|_{\rm act} \le M_{\rm act} $$

for some appropriate norm $\|\cdot\|_{\rm act}$ and limit $M_{\rm act}$. Reasons include:

• Actuator heating. Large actuator signals may cause excessive heating, which 
will damage or cause wear to the system. Such constraints can often be 
expressed in terms of an RMS norm of u, possibly with weights (see sec- 
tion 4.2.2). 

• Saturation or overload. Exceeding absolute limits on actuator signals may 
damage an actuator, or cause the plant P to be a poor model of the system 
to be controlled. These specifications can be expressed in terms of a scaled or 
weighted peak norm of u. 

• Power, fuel, or resource use. Large actuator signals may be associated with 
excessive power consumption or fuel or resource use. These specifications are 
often expressed in terms of a scaled or weighted average-absolute norm of u. 

• Mechanical or other wear. Excessively rapid changes in the actuator signal 
may cause undesirable stresses or excessive wear. These constraints may be 
expressed in terms of slew rate, acceleration, or jerk norms of u (see sec- 
tion 4.2.8). 

These limits on the size of $u$ can be enforced by limiting in an appropriate way
the size of $H_{\rm ac}$ and $H_{\rm ad}$, the closed-loop transfer matrices from the command and
disturbance signals to the actuator. For example, if the command signal is modeled
as a stochastic process with a given power spectral density, then a weighted $H_2$
norm-bound on $H_{\rm ac}$ will guarantee a maximum RMS actuator effort due to the
command signal. If the command signal is modeled as unknown-but-bounded, and
the peak of the actuator signal must be limited, then the actuator effort specification
is a limit on the peak gain of $H_{\rm ac}$. These specifications are analogous to many we
have already encountered in this chapter.
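
For a scalar transfer function with impulse response $h$ (and no impulsive term), the peak gain equals $\int_0^\infty |h(t)|\,dt$, so a peak-gain limit can be checked by numerical integration. The sketch below assumes a hypothetical $H_{\rm ac}(s) = s/((s+1)(s+2))$ and an assumed command bound `M_cmd`; neither comes from the text.

```python
import math

def peak_gain(h, t_end=40.0, n=200000):
    """Trapezoidal estimate of the integral of |h(t)| over [0, t_end]."""
    dt = t_end / n
    total = 0.5 * (abs(h(0.0)) + abs(h(t_end)))
    for k in range(1, n):
        total += abs(h(k * dt))
    return total * dt

# Impulse response of the hypothetical H_ac(s) = s / ((s + 1)(s + 2))
def h_ac(t):
    return 2.0 * math.exp(-2.0 * t) - math.exp(-t)

pk = peak_gain(h_ac)          # the exact integral is 0.5 for this h_ac
M_cmd = 1.5                   # assumed bound on the peak of the command
u_peak_bound = pk * M_cmd     # resulting bound on the peak actuator signal
```

The truncation time `t_end` must be long compared with the slowest time constant of the impulse response, otherwise the tail of the integral is lost.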

We mention one simple but important distinction between a limit on the size of
$u$ and the associated limit on the size of $H_{\rm ac}$ (or $H_{\rm ad}$). We will assume for simplicity
that the command and actuator signals are scalar, and the disturbance is negligible.
Suppose that we have the constraint

$$ \|u\|_\infty \le 1 \tag{8.18} $$

on our actuator signal (perhaps, an amplifier driving a DC motor saturates), and
our command signal is constant for long periods of time, and occasionally changes
abruptly to a new set-point value between $-1$ and $1$. We can ensure that (8.18)
holds for all such command signals with the closed-loop convex design specification

$$ 2\|H_{\rm ac}\|_{\rm pk\_step} \le 1. \tag{8.19} $$

This specification ensures that even with the worst case full-scale set-point changes,
from $-1$ to $1$ and vice versa, the peak of the actuator signal will not exceed one.
By linearity, the specification (8.19) ensures that a set-point change from $-0.6$ to
$0.4$ will yield an actuator signal with $\|u\|_\infty \le 0.6$. Roughly speaking, for such a
set-point change we are only making use of 60% of our allowable actuator signal
size; this may exact a cost in, say, the time required for the commanded variable to
converge to within 0.01 of the final value 0.4.

This is illustrated in figure 8.14, which shows two command signals and the
associated actuator signals in a control system that satisfies the specification (8.19).
The command signal $w$ in figure 8.14(a) is one of the worst case, full-scale set-point
changes, and causes the actuator signal $u$, shown in figure 8.14(b), to nearly
saturate. The command $w$ in figure 8.14(c), however, results in the actuator signal
in figure 8.14(d), which uses only 48% of the allowable actuator capability.

Figure 8.14 The specification (8.19) ensures that the actuator signal magnitude
will not exceed one during set-point changes in the range $-1$ to $1$.
The input $w$ in (a) shows a set-point change that drives the actuator signal,
shown in (b), close to its limit. Because of linearity, smaller set-point
changes will result in smaller actuator effort: the set-point change $w$ in (c)
produces the actuator signal $u$ in (d), which only uses 48% of the available
actuator effort.

8.4 Combined Effect of Disturbances and Commands

So far we have treated command inputs and disturbances separately; the specifications
we have seen constrain the behavior of the closed-loop system when one, but
not both, of these exogenous inputs acts. As a simple example, assume that the
system has a single command input, a single disturbance, and a single actuator, so
that $n_{\rm c} = n_{\rm d} = n_{\rm a} = 1$. Consider the two specifications

$$ \mathcal{H}_{\rm env} = \{ H \mid s_{\min}(t) \le s(t) \le s_{\max}(t) \text{ for all } t \ge 0 \}, $$

$$ \mathcal{H}_{\rm rms\_act} = \{ H \mid \|H_{\rm ad}\|_2 \le 1 \}. $$

The first specification requires that the step response from $w_{\rm c}$ to $z_{\rm c}$ lie inside the
envelope given by $s_{\min}$ and $s_{\max}$, as in figure 8.5. This means that the commanded
variable $z_{\rm c}$ will lie in the given envelope when a step input is applied to $w_{\rm c}$ and the
disturbance input is zero. The second specification requires that the RMS value of
the actuator $u$ be less than one when white noise is applied to $w_{\rm d}$ and the command
input is zero. If a unit step command input is applied to $w_{\rm c}$ and a white noise is
applied to $w_{\rm d}$ simultaneously, the response at $z_{\rm c}$ is simply the sum of the responses
with the inputs acting separately. It is quite likely that this $z_{\rm c}$ would not lie in the
specified envelope, because of the effect of the disturbance $w_{\rm d}$; similarly the RMS
value of $u$ would probably exceed one because of the constant component of the
actuator signal that is due to the command.

This phenomenon is a basic consequence of linearity. These separate specifications
often suffice in practice since each regulated variable may be mostly dependent
on either the command or disturbances. For example, in a given system the disturbance
may be small, so the component of the actuator signal due to the disturbance
(i.e., $H_{\rm ad}w_{\rm d}$) may be much smaller than the component due to the command (i.e.,
$H_{\rm ac}w_{\rm c}$); therefore an actuator effort specification that limits the size of $H_{\rm ac}$ will
probably acceptably limit the size of $u$, even though it "ignores" the effect of the
disturbance.

We can also describe this phenomenon from a more general viewpoint. Each
specification we have considered so far is a specification on some submatrix of $H$
that does not contain all of its columns, and therefore considers the effects of only a
subset of the exogenous inputs. In contrast, a specification on a submatrix of $H$ that
contains all of its columns will consider the effects of all of the exogenous inputs,
acting simultaneously. For example, the RMS actuator effort specification $\mathcal{H}_{\rm rms\_act}$
involves only the submatrix $H_{\rm ad}$, and does not consider the effect of the command
on the RMS value of the actuator signal. On the other hand, the specification

$$ \mathcal{H}_{\rm rms\_act\_cmb} = \left\{ H \;\middle|\; \sqrt{\|H_{\rm ad}\|_2^2 + H_{\rm ac}(0)^2} \le 1 \right\}, $$

which is a specification on the bigger submatrix $[H_{\rm ac}\ H_{\rm ad}]$ of $H$, correctly guarantees
that the RMS value of $u$ will not exceed one when the command is a unit step and
the disturbance is a white noise (and, we should add, $w_{\rm etc} = 0$).
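
The form of this combined specification follows from a short calculation, sketched here under the assumptions that $H_{\rm ac}$ and $H_{\rm ad}$ are stable and that the white noise is zero-mean with unit power spectral density. Writing $u_{\rm c}$ and $u_{\rm d}$ (names introduced only for this sketch) for the components of $u$ due to the command and the disturbance, the constant and zero-mean parts contribute additively to the mean-square value:

```latex
\begin{aligned}
u(t) &= u_{\rm c}(t) + u_{\rm d}(t), \qquad
u_{\rm c}(t) \to H_{\rm ac}(0) \ \text{as } t \to \infty, \qquad
\mathbf{E}\, u_{\rm d}(t) = 0, \\
\mathbf{rms}(u)^2 &= H_{\rm ac}(0)^2 + \mathbf{rms}(u_{\rm d})^2
  = H_{\rm ac}(0)^2 + \|H_{\rm ad}\|_2^2 .
\end{aligned}
```

The cross term vanishes because $u_{\rm d}$ has zero mean, which is why the two contributions combine as a sum of squares rather than a sum of magnitudes.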

This discussion suggests that a general actuator effort specification should really
limit the size of the transfer matrix $[H_{\rm ac}\ H_{\rm ad}]$. Limiting the sizes of $H_{\rm ac}$ and $H_{\rm ad}$
separately will, of course, limit the size of $[H_{\rm ac}\ H_{\rm ad}]$; this corresponds to a prior
allocation of actuator effort between regulation and command-following tasks.

In cases where different types of models for the commands and disturbances are
used (or indeed, different types of models for different components of either), it
can be difficult or cumbersome to formulate a sensible specification on the bigger
submatrix of $H$. Returning to our example, let us form a specification on the
response of $z_{\rm c}$ that considers both a unit step at $w_{\rm c}$ (a particular signal) and a
white noise at $w_{\rm d}$ (a stochastic process). A possible form for such a specification
might be

$$ \mathcal{H}_{\rm env\_cmb} = \{ H \mid \mathbf{Prob}\left( s_{\min}(t) \le z_{\rm c}(t) \le s_{\max}(t) \text{ for } 0 \le t \le 10 \right) \ge 0.90 \}, $$

where $z_{\rm c} = H_{\rm cc}w_{\rm c} + H_{\rm cd}w_{\rm d}$, $w_{\rm c}$ is a unit step, and $w_{\rm d}$ is a white noise. Roughly
speaking, the specification $\mathcal{H}_{\rm env\_cmb}$ requires at least a 90% probability that the
envelope constraint $\mathcal{H}_{\rm env}$ be satisfied over the first ten seconds. $\mathcal{H}_{\rm env\_cmb}$ limits the
effect the (stochastic) disturbance can have during a command step input; however,
we do not know whether $\mathcal{H}_{\rm env\_cmb}$ is convex.



Chapter 9 

Differential Sensitivity 
Specifications 



In this chapter we consider specifications that limit the differential sensitivity 
of the closed- loop transfer matrix with respect to changes in the plant. For 
certain important cases, these specifications are closed-loop convex. The most 
general specification that limits differential sensitivity of the closed-loop system 
is, however, not closed-loop convex. 

In the previous chapter we considered various specifications that prescribe how the 
closed-loop system should perform. This included such important considerations 
as the response of the system to commands and disturbances that may affect the 
system. In this chapter and the next we focus on another extremely important 
consideration: how the system would perform if the plant were to change. 

Many control engineers believe that the primary benefits of feedback are those
considered in this chapter and the next: robustness or insensitivity of the closed-loop
system to variations or perturbations in the plant. From another point of
view, the performance of a control system is often limited not by its ability to meet
the performance specifications of the previous chapter, but rather by its ability to
meet the specifications to be studied in this chapter and the next, which limit the
sensitivity or guarantee robustness of the system.

There are several general methods that measure how sensitive the closed-loop 
system is to changes in the plant: 

• Differential sensitivity: the size of the derivative of H with respect to P. 

• Worst case perturbation: the largest change in H that can be caused by a 
certain specific set of plant perturbations. 

• Margin: the smallest change in the plant that can cause some specification to
be violated.

In this chapter we consider the first method; the other two are discussed in the next
chapter.

9.1 Bode's Log Sensitivities 

9.1.1 First Order Fractional Sensitivity 

H. Bode was the first to systematically study the effect of small changes in closed-loop
transfer functions due to small changes in the plant. He considered the I/O
transfer function $T$ of the SASS 1-DOF control system (see section 2.3.2),

$$ T = \frac{P_0K}{1 + P_0K}. $$

He noted that for any frequency $s$,

$$ \frac{\partial T(s)}{\partial P_0(s)} \bigg/ \frac{T(s)}{P_0(s)} = \frac{1}{1 + P_0(s)K(s)} = S(s), \tag{9.1} $$

which gave the name and symbol $S$ to the classical sensitivity transfer function.
We thus have a basic rule-of-thumb for the 1-DOF control system,

$$ \frac{\delta T(s)}{T(s)} \simeq S(s)\frac{\delta P_0(s)}{P_0(s)} \tag{9.2} $$

($\simeq$ means equal to first order), which we interpret as follows: the fractional or
relative change in the I/O transfer function is, to first order, the sensitivity transfer
function times the fractional change in $P_0$. For example, and roughly speaking, at
a frequency $\omega$ with $|S(j\omega)| = 0.1$, a ten percent change in the complex number
$P_0(j\omega)$ yields a change in $T(j\omega)$ of only (and approximately) one percent.
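
The rule-of-thumb (9.2) can be checked numerically at a single frequency by comparing the actual fractional change in $T$ with the first order prediction. This is a minimal sketch: the plant `P0`, the constant controller `K`, and the evaluation frequency are hypothetical choices for illustration and are not the standard example of section 2.4.

```python
# Hypothetical loop, chosen only for illustration
def P0(s):
    return 1.0 / (s * (s + 1.0))

def K(s):
    return 10.0

def T_of(P, s):
    """I/O transfer function T = P K / (1 + P K) of the 1-DOF loop at s."""
    L = P(s) * K(s)
    return L / (1.0 + L)

s = 0.5j                            # evaluate at omega = 0.5 rad/s
S = 1.0 / (1.0 + P0(s) * K(s))      # sensitivity transfer function
frac_dP = 0.10                      # ten percent change in P0(j omega)

T_nom = T_of(P0, s)
T_pert = T_of(lambda z: (1.0 + frac_dP) * P0(z), s)

actual = (T_pert - T_nom) / T_nom   # actual fractional change in T
first_order = S * frac_dP           # prediction from (9.2)
```

With these assumed values $|S| \approx 0.057$, so a ten percent plant change produces a change in $T$ of well under one percent, and the first order prediction differs from the actual fractional change by roughly a tenth of its size.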

An important consequence of (9.2) is that a design specification that limits the
first order fractional change in the I/O transfer function with respect to fractional
changes in $P_0$ can be expressed as an equivalent closed-loop convex specification that
limits the size of the sensitivity transfer function. For example, the specification

$$ \left| \frac{\delta T(j\omega)}{T(j\omega)} \right| \lesssim 0.05 \left| \frac{\delta P_0(j\omega)}{P_0(j\omega)} \right| \quad \text{for } \omega \le \omega_{\rm bw}, \tag{9.3} $$

($\lesssim$ means $\le$ holds to first order), which limits the first order fractional change in $T$
to no more than 5% of the fractional change in $P_0$ for frequencies less than $\omega_{\rm bw}$, is
equivalent to the design specification

$$ |S(j\omega)| \le 0.05 \quad \text{for } \omega \le \omega_{\rm bw}, \tag{9.4} $$

which is closed-loop convex.

The precise meaning of (9.3) is

$$ \lim_{\delta P_0(j\omega) \to 0} \left| \frac{\delta T(j\omega)}{T(j\omega)} \bigg/ \frac{\delta P_0(j\omega)}{P_0(j\omega)} \right| = |S(j\omega)| \le 0.05 \quad \text{for } \omega \le \omega_{\rm bw}. \tag{9.5} $$

In many cases, the limit in (9.5) is rapidly approached, i.e., the first order approximation
to the fractional change in $T$ accurately predicts the actual fractional change
in $T$ due to a (non-differential) change in $P_0$. We will see two examples of this in
sections 9.1.3 and 9.1.4.

9.1.2 Logarithmic Sensitivity 

We can express (9.1) as

$$ S(s) = \frac{\partial \log T(s)}{\partial \log P_0(s)}. \tag{9.6} $$

For this reason $S(s)$ is called the logarithmic sensitivity of $T$ with respect to $P_0$.
We must be careful about what (9.6) means. By $\log T(s)$ we mean

$$ \log T(s) = \log|T(s)| + j\angle T(s), \tag{9.7} $$

where $\angle T(s)$ is a phase angle, in radians, of $T(s)$. Whereas $\log|T(s)|$ is unambiguous
and well-defined for all $s$ for which $T(s) \ne 0$, the phase angle $\angle T(s)$ is ambiguous:
it is only defined to within an integer multiple of $2\pi$. On any simply-connected
region in the complex plane on which $T(s) \ne 0$, it is possible to make a selection
of particular phase angles in (9.7) at each $s$ in such a way that $\angle T(s)$ is continuous
on the region, and in fact $\log T(s)$ is analytic there. When this process of phase
selection is applied along the imaginary axis, it is called phase unwrapping.

In particular, if $T(s_0) \ne 0$, then in some small disk around $T(s_0)$ in the complex
plane, we can define $\log T(s)$ (i.e., choose the phase angles) so that it is analytic
there. Moreover any two such definitions of $\log T(s)$ will differ by a constant multiple
of $2\pi j$, and therefore yield the same result in the partial derivative in (9.6), evaluated
at $s_0$. A similar discussion applies to the expression $\log P_0(s)$: while it need not
make sense as an unambiguous function of the complex variable $s$ over the whole
complex plane, the result in (9.6) will nevertheless be unambiguous.

The real part of (9.7), which is the natural log of the magnitude, has an uncom- 
mon unit in engineering, called nepers. One neper is the gain that corresponds to a 
phase angle of one radian, and is approximately 8.7dB. In more familiar units, one 
decibel corresponds to about 6.6 degrees of phase. 
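
The two unit conversions quoted here follow from the definitions of the neper and the decibel, and can be reproduced in a few lines. The helper name `db_to_equivalent_phase_deg` is introduced only for this sketch.

```python
import math

# One neper (a factor of e in magnitude) expressed in decibels
DB_PER_NEPER = 20.0 * math.log10(math.e)

def db_to_equivalent_phase_deg(db):
    """Phase in degrees whose size in radians equals the gain `db` in nepers."""
    return math.degrees(db / DB_PER_NEPER)

one_db_in_degrees = db_to_equivalent_phase_deg(1.0)
```

Running this gives about 8.69dB per neper and about 6.6 degrees per decibel, in agreement with the rounded figures in the text.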

Equation (9.2) can be expressed more completely as

$$ \delta\log|T(s)| \simeq \Re S(s)\,\delta\log|P_0(s)| - \Im S(s)\,\delta\angle P_0(s), \tag{9.8} $$

$$ \delta\angle T(s) \simeq \Im S(s)\,\delta\log|P_0(s)| + \Re S(s)\,\delta\angle P_0(s). \tag{9.9} $$

These formulas show the first order change in the magnitude and phase of the I/O
transfer function caused by magnitude and phase variations in $P_0$.

9.1.3 Example: Gain Variation

Our first example concerns a change in gain, i.e.,

$$ \delta P_0(s) = \alpha P_0(s), \tag{9.10} $$

so that

$$ \delta\log P_0(j\omega) \simeq \alpha. $$

Hence from (9.8-9.9) we have

$$ \delta\log|T(j\omega)| \simeq \alpha\,\Re S(j\omega), \tag{9.11} $$

$$ \delta\angle T(j\omega) \simeq \alpha\,\Im S(j\omega). \tag{9.12} $$

It follows, for example, that the closed-loop convex specification

$$ \Re S(j\omega_0) = 0 $$

guarantees that the magnitude of the I/O transfer function at the frequency $\omega_0$ is
first order insensitive to variations in $\alpha$.

To give a specific example that compares the first order deviation in $|T(j\omega)|$ to
the real deviation, we take the standard example plant and controller $K^{(a)}$ described
in section 2.4, and consider the effect on $|T(j\omega)|$ of a gain perturbation of $\alpha = 25\%$,
which is about 2dB. Figure 9.1 shows:

$$ |T(j\omega)| = \left| \frac{P_0^{\rm std}(j\omega)K^{(a)}(j\omega)}{1 + P_0^{\rm std}(j\omega)K^{(a)}(j\omega)} \right|, $$

which is the nominal magnitude of the I/O transfer function;

$$ |T^{\rm pert}(j\omega)| = \left| \frac{1.25P_0^{\rm std}(j\omega)K^{(a)}(j\omega)}{1 + 1.25P_0^{\rm std}(j\omega)K^{(a)}(j\omega)} \right|, $$

which is the actual magnitude with the 25% gain increase in $P_0^{\rm std}$; and

$$ |T^{\rm approx}(j\omega)| = |T(j\omega)| \exp\left( 0.25\,\Re\left( \frac{1}{1 + P_0^{\rm std}(j\omega)K^{(a)}(j\omega)} \right) \right), \tag{9.13} $$

which is the magnitude of the perturbed I/O transfer function predicted by the first
order perturbation formula (9.11). For this example, the first order prediction gives
a good approximation of the perturbed magnitude of the I/O transfer function,
even for this 2dB gain change in $P_0^{\rm std}$.

Figure 9.1 When $P_0^{\rm std}$ is replaced by $1.25P_0^{\rm std}$, the I/O transfer function's
magnitude changes from $|T|$ to $|T^{\rm pert}|$. $|T^{\rm approx}|$ is a first order approximation
of $|T^{\rm pert}|$ computed from (9.13). In this example the effect of a plant
gain change as large as 25% is well approximated using the differential sensitivity.
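
The comparison in figure 9.1 can be imitated at a single frequency. The sketch below uses a hypothetical loop gain $P_0K = 10/(s(s+1))$ rather than the standard example, so its numbers are illustrative only; the function name `compare` is likewise an assumption.

```python
import math

def loop_gain(s):
    """Hypothetical loop gain P0(s) K(s); not the book's standard example."""
    return 10.0 / (s * (s + 1.0))

def compare(w, alpha=0.25):
    """Return (|T|, |T_pert|, |T_approx|) at frequency w for gain change alpha."""
    L = loop_gain(1j * w)
    S = 1.0 / (1.0 + L)
    T_nom = abs(L / (1.0 + L))
    T_pert = abs((1.0 + alpha) * L / (1.0 + (1.0 + alpha) * L))
    T_approx = T_nom * math.exp(alpha * S.real)   # same form as (9.13)
    return T_nom, T_pert, T_approx

T_nom, T_pert, T_approx = compare(0.5)
```

At this low frequency the loop gain is large, so the predicted and actual perturbed magnitudes agree to within a fraction of a percent; nearer crossover the first order prediction would be expected to be less accurate.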



9.1.4 Example: Phase Variation 

In this example we study the effects on $T$ of a phase variation in $P_0$, i.e.,

$$ P_0(j\omega) + \delta P_0(j\omega) = e^{j\theta(\omega)}P_0(j\omega). $$

In this case we have, from (9.8-9.9),

$$ \delta\log|T(j\omega)| \simeq -\theta(\omega)\,\Im S(j\omega), \tag{9.14} $$

$$ \delta\angle T(j\omega) \simeq \theta(\omega)\,\Re S(j\omega). \tag{9.15} $$

To guarantee that $|T(j\omega_0)|$ is, for example, first order insensitive to phase variations
in $P_0(j\omega_0)$, we have the specification

$$ \Im S(j\omega_0) = 0, $$

which is closed-loop affine.

We now consider a specific example that compares the actual effect of a phase
variation in $P_0$ to the effect predicted by the first order perturbational analysis (9.14).
As above, our plant is the standard example described in section 2.4,
together with the same controller $K^{(a)}$. The specific perturbed $P_0^{\rm std}$ is

$$ P_0^{\rm pert}(s) = P_0^{\rm std}(s)\,\frac{(5 - s)(10 + s)}{(5 + s)(10 - s)}, $$

i.e., the phase factor $(10 - s)/(10 + s)$ in $P_0^{\rm std}$ is replaced by $(5 - s)/(5 + s)$,
so that the phase perturbation is

$$ \theta(\omega) = 2\left( \tan^{-1}\frac{\omega}{10} - \tan^{-1}\frac{\omega}{5} \right), $$

which is plotted in figure 9.2. The maximum phase shift of $-38.9^\circ$ corresponds to
about 6dB of gain variation (see the discussion in section 9.1.2).

Figure 9.2 A specific phase shift in $P_0^{\rm std}$.



Figure 9.3 shows the nominal magnitude of $T$, the actual perturbed magnitude
caused by the phase shift in $P_0^{\rm std}$, and the perturbed magnitude predicted by the
first order analysis,

$$ |T^{\rm approx}(j\omega)| = |T(j\omega)| \exp\left( -\theta(\omega)\,\Im\left( \frac{1}{1 + P_0^{\rm std}(j\omega)K^{(a)}(j\omega)} \right) \right). \tag{9.16} $$
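
The phase perturbation in this example, $\theta(\omega) = 2(\tan^{-1}(\omega/10) - \tan^{-1}(\omega/5))$, can be evaluated directly to confirm the quoted maximum phase shift and its rough gain equivalent. This is a small sketch; the grid limits and variable names are assumptions.

```python
import math

def theta(w):
    """Phase perturbation in radians: 2 * (atan(w/10) - atan(w/5))."""
    return 2.0 * (math.atan(w / 10.0) - math.atan(w / 5.0))

# Coarse search for the largest (most negative) phase shift on a log grid
grid = [10 ** (-2 + 4 * k / 4000) for k in range(4001)]
w_star = min(grid, key=theta)
theta_min_deg = math.degrees(theta(w_star))

# Equivalent gain variation: one radian of phase corresponds to one neper
equiv_db = abs(theta(w_star)) * 20.0 * math.log10(math.e)
```

The extreme occurs near $\omega = \sqrt{50} \approx 7.07$, where the shift is about $-38.9^\circ$, or roughly 5.9dB in equivalent gain, consistent with the "about 6dB" stated in the text.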



9.1.5 Other Log Sensitivities 

We have seen that the logarithmic sensitivity of the I/O transfer function T is given 
by another closed-loop transfer function, the sensitivity S. Several other important 
closed-loop transfer functions have logarithmic sensitivities that are also closed-loop 
transfer functions. Table 9.1 lists some of these. 

Figure 9.3 When the phase factor $(10 - s)/(10 + s)$ in $P_0^{\rm std}$ is replaced
by $(5 - s)/(5 + s)$, the magnitude of the I/O transfer function changes from
$|T|$ to $|T^{\rm pert}|$. $|T^{\rm approx}|$ is a first order approximation of $|T^{\rm pert}|$ computed
from (9.16).

  $H$                      $\partial \log H / \partial \log P_0$

  $1/(1 + P_0K)$           $-P_0K/(1 + P_0K)$
  $K/(1 + P_0K)$           $-P_0K/(1 + P_0K)$
  $P_0/(1 + P_0K)$         $1/(1 + P_0K)$
  $P_0K/(1 + P_0K)$        $1/(1 + P_0K)$

Table 9.1 The logarithmic sensitivities of some important closed-loop transfer
functions are also closed-loop transfer functions. In the general case,
however, the logarithmic sensitivity of a closed-loop transfer function need
not be another closed-loop transfer function.

From the top line of this table we see that a specification such as

$$ \left| \frac{\partial \log S(j\omega)}{\partial \log P_0(j\omega)} \right| \le 2 \quad \text{for } \omega \le \omega_{\rm bw} \tag{9.17} $$

is equivalent to the closed-loop convex specification

$$ \left| \frac{-P_0(j\omega)K(j\omega)}{1 + P_0(j\omega)K(j\omega)} \right| \le 2 \quad \text{for } \omega \le \omega_{\rm bw}. \tag{9.18} $$

In (9.17) we might interpret $-S$ as the command input to tracking error transfer
function, so that the specification (9.17), and hence (9.18), limits the logarithmic
sensitivity of the command input to tracking error transfer function with respect to
changes in $P_0$.

9.2 MAMS Log Sensitivity 

It is possible to generalize Bode's results to the MAMS case. We consider the
MAMS 2-DOF control system (see section 2.3.4), with I/O transfer matrix

$$ T = (I + P_0K_{\tilde y})^{-1}P_0K_r. $$

If the plant is perturbed so that $P_0$ becomes $P_0 + \delta P_0$, we have

$$ T + \delta T = (I + (P_0 + \delta P_0)K_{\tilde y})^{-1}(P_0 + \delta P_0)K_r. $$

Retaining only first order terms in $\delta P_0$ we have

$$ \begin{aligned}
\delta T &\simeq (I + P_0K_{\tilde y})^{-1}\delta P_0K_r - (I + P_0K_{\tilde y})^{-1}\delta P_0K_{\tilde y}(I + P_0K_{\tilde y})^{-1}P_0K_r \\
&= (I + P_0K_{\tilde y})^{-1}\delta P_0(I + K_{\tilde y}P_0)^{-1}K_r \\
&= S\,\delta P_0(I + K_{\tilde y}P_0)^{-1}K_r,
\end{aligned} \tag{9.19} $$

where $S = (I + P_0K_{\tilde y})^{-1}$ is the (output-referred) sensitivity matrix of the MAMS
2-DOF control system.
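
The step from the first to the second line of (9.19) is not spelled out. A sketch of the omitted algebra, writing $K$ for the feedback part of the controller and using the identity $K(I + P_0K)^{-1} = (I + KP_0)^{-1}K$ (commonly called the push-through identity, a name not used in the text):

```latex
\begin{aligned}
I - K(I + P_0K)^{-1}P_0 &= I - (I + KP_0)^{-1}KP_0 \\
&= (I + KP_0)^{-1}\bigl[(I + KP_0) - KP_0\bigr] \\
&= (I + KP_0)^{-1}.
\end{aligned}
```

Factoring $(I + P_0K)^{-1}\delta P_0$ out on the left and the command part of the controller out on the right of the two first order terms leaves exactly this bracket in the middle.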

Now suppose that we can express the change in $P_0$ as

$$ \delta P_0 = \delta P_0^{\rm frac}P_0. $$

We can interpret $\delta P_0^{\rm frac}$ as an output-referred fractional perturbation of $P_0$, as shown
in figure 9.4. Then from (9.19) we have

$$ \delta T \simeq S\,\delta P_0^{\rm frac}P_0(I + K_{\tilde y}P_0)^{-1}K_r = S\,\delta P_0^{\rm frac}\,T, $$

so that

$$ \delta T \simeq \delta T^{\rm frac}\,T, \quad \text{where} \quad \delta T^{\rm frac} \simeq S\,\delta P_0^{\rm frac}. \tag{9.20} $$

This is analogous to (9.2): it states that the output-referred fractional change in
the I/O transfer matrix $T$ is, to first order, the sensitivity matrix $S$ times the
output-referred fractional change in $P_0$.

Figure 9.4 An output-referred fractional perturbation of the transfer matrix
$P_0$. In the SASS case, $\delta P_0^{\rm frac} \approx \delta\log P_0$.



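The design specification,

$$ \sigma_{\max}(\delta T^{\rm frac}(j\omega)) \lesssim 0.01 \ \text{for } \omega \le \omega_{\rm bw}, \qquad \sigma_{\max}(\delta P_0^{\rm frac}(j\omega)) \le 0.20, \tag{9.21} $$

which limits the first order fractional change in $T$ to 1% over the bandwidth $\omega_{\rm bw}$,
despite variations in $P_0$ of 20%, is therefore equivalent to the closed-loop convex
specification

$$ \sigma_{\max}(S(j\omega)) \le 0.05 \quad \text{for } \omega \le \omega_{\rm bw}. \tag{9.22} $$

We remind the reader that the inequality in (9.21) holds only to first order in
$\delta P_0^{\rm frac}$. Its precise meaning is

$$ \lim_{\delta P_0^{\rm frac} \to 0} \frac{\sigma_{\max}(\delta T^{\rm frac}(j\omega))}{\sigma_{\max}(\delta P_0^{\rm frac}(j\omega))} \le \frac{0.01}{0.20} \quad \text{for } \omega \le \omega_{\rm bw}. $$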

9.2.1 Cruz and Perkins' Comparison Sensitivity 

Cruz and Perkins gave another generalization of Bode's log sensitivity to the MAMS 
2-DOF control system using the concept of comparison sensitivity, in which the 
perturbed closed-loop system is compared to an equivalent open-loop system. 

The open-loop equivalent system consists of $P_0 + \delta P_0$ driven by the unperturbed
actuator signal, as shown at the top of figure 9.5. For $\delta P_0 = 0$, the open-loop
equivalent system is identical to the closed-loop system shown at the bottom of
figure 9.5. For $\delta P_0$ nonzero, however, the two systems differ. By comparing the first
order changes in these two systems, we can directly see the effect of the feedback
on the perturbation $\delta P_0$.

The transfer matrix of the open-loop equivalent system is

$$ T^{\rm ole} = (P_0 + \delta P_0)(I + K_{\tilde y}P_0)^{-1}K_r, $$

so that

$$ \delta T^{\rm ole} = \delta P_0(I + K_{\tilde y}P_0)^{-1}K_r. $$

Figure 9.5 In the open-loop equivalent system, shown at top, the actuator
signal $u$ drives $P_0 + \delta P_0$, so there is no feedback around the perturbation $\delta P_0$.
The benefit of feedback can be seen by comparing the first order changes in
the transfer matrices from $r$ to $y_{\rm p}^{\rm ole}$ and $y_{\rm p}^{\rm pert}$, respectively (see (9.23)).



Comparing this to the first order change in the I/O transfer matrix, given in (9.19),
we have

$$ \delta T \simeq S\,\delta T^{\rm ole}. \tag{9.23} $$

This simple equation shows that the first order variation in the I/O transfer
matrix is equal to the sensitivity transfer matrix times the first order variation in
the open-loop equivalent system. It follows that the specification (9.22) can be
interpreted as limiting the sensitivity of the I/O transfer matrix to be no more than
5% of the sensitivity of the I/O transfer matrix of the open-loop equivalent system.
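
In the scalar case the comparison (9.23) is easy to check numerically. The sketch below assumes a hypothetical plant and constant controller gains `K_y` and `K_r` (none of them from the text), perturbs the plant by one percent at a single frequency, and compares the actual closed-loop change with $S$ times the open-loop equivalent change.

```python
# Hypothetical scalar 2-DOF loop, for illustration only
def P0(s):
    return 1.0 / (s * (s + 1.0))

K_y = 10.0   # feedback part of the controller (assumed constant gain)
K_r = 10.0   # command part of the controller (assumed constant gain)

def T_closed(P):
    """Closed-loop I/O transfer function (1 + P K_y)^(-1) P K_r for scalar P."""
    return P * K_r / (1.0 + P * K_y)

s = 0.5j
P = P0(s)
dP = 0.01 * P                          # small plant perturbation

S = 1.0 / (1.0 + P * K_y)              # sensitivity
dT_ole = dP * K_r / (1.0 + K_y * P)    # change in the open-loop equivalent system
dT = T_closed(P + dP) - T_closed(P)    # actual change in the closed-loop system

rel_err = abs(dT - S * dT_ole) / abs(S * dT_ole)
feedback_ratio = abs(dT) / abs(dT_ole)
```

For this assumed loop the first order relation holds to about one percent, and the closed-loop change is under 6% of the open-loop equivalent change, which is the benefit of feedback that the comparison sensitivity is meant to expose.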



9.3 General Differential Sensitivity 

The general expression for the first order change in the closed-loop transfer matrix
$H$ due to a change in the plant transfer matrix is

$$ \begin{aligned}
\delta H \simeq{} & \delta P_{zw} + \delta P_{zu}K(I - P_{yu}K)^{-1}P_{yw} + P_{zu}K(I - P_{yu}K)^{-1}\delta P_{yw} \\
& + P_{zu}K(I - P_{yu}K)^{-1}\delta P_{yu}K(I - P_{yu}K)^{-1}P_{yw}.
\end{aligned} \tag{9.24} $$

The last term shows that $\partial H/\partial P_{yu}$ (which is a complicated object with four indices)
has components that are given by the product of two closed-loop transfer functions.
It is usually the case that design specifications that limit the size of $\partial H/\partial P_{yu}$ are
not closed-loop convex, since, roughly speaking, a product can be made small by
making either of its terms small.
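
The remark about products can be illustrated with a scalar analogy. This is only a sketch of the mechanism, not a closed-loop computation: the pair `(x, y)` stands in for two closed-loop quantities, and the bound 0.75 is an arbitrary assumed value.

```python
def satisfies(x, y, bound=0.75):
    """Scalar stand-in for a limit on a product of two closed-loop quantities."""
    return abs(x * y) <= bound

a = (3.0, 0.2)    # first factor large, second small: product 0.6
b = (0.2, 3.0)    # first factor small, second large: product 0.6
mid = ((a[0] + b[0]) / 2.0, (a[1] + b[1]) / 2.0)   # (1.6, 1.6): product 2.56

feasible_a = satisfies(*a)
feasible_b = satisfies(*b)
feasible_mid = satisfies(*mid)
```

Both endpoints meet the bound but their average does not, so the set of pairs satisfying a product bound is not convex.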



9.3.1 An Example 

Using the standard example plant and controller $K^{(a)}$, described in section 2.4, we
consider the sensitivity of the I/O step response with respect to gain variations in
$P_0^{\rm std}$. Since $\delta P_0^{\rm std} = \alpha P_0^{\rm std}$, the sensitivity

$$ s_\alpha(t) \triangleq \left. \frac{\partial s(t)}{\partial \alpha} \right|_{\alpha = 0} $$

is simply the unit step response of the transfer function

$$ \frac{P_0^{\rm std}K^{(a)}}{(1 + P_0^{\rm std}K^{(a)})^2} = ST. \tag{9.25} $$

This transfer function is the product of two closed-loop transfer functions, which is
consistent with our general comments above.

In figure 9.6 the actual effect of a 20% gain reduction in $P_0^{\rm std}$ on the step response
is compared to the step response predicted by the first order perturbational analysis,

$$ s^{\rm approx}(t) = s(t) - 0.2s_\alpha(t), $$

with the controller $K^{(a)}$. The step response sensitivity with this controller is shown
in figure 9.7. For plant gain changes between $\pm 20\%$, the first order approximation
to the step response falls in the shaded envelope $s(t) \pm 0.2s_\alpha(t)$.
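
The same comparison can be carried out in closed form for a much simpler loop. The sketch below assumes a hypothetical plant $P_0(s) = 1/s$ with a constant controller gain `k`, for which $T = k/(s+k)$ and $ST = ks/(s+k)^2$; it is not the standard example, and the function names are introduced only here.

```python
import math

k = 1.0   # hypothetical loop: P0(s) = 1/s with constant controller gain k

def step(t, alpha=0.0):
    """Unit step response of T when the plant gain is scaled by (1 + alpha)."""
    return 1.0 - math.exp(-(1.0 + alpha) * k * t)

def step_sensitivity(t):
    """Step response of S*T = k s / (s + k)^2, i.e. d step / d alpha at alpha = 0."""
    return k * t * math.exp(-k * t)

t = 1.0
s_pert = step(t, alpha=-0.2)                      # actual, 20% gain reduction
s_approx = step(t) - 0.2 * step_sensitivity(t)    # first order prediction
fd = (step(t, 1e-6) - step(t)) / 1e-6             # finite-difference check
```

For this loop the first order prediction at $t = 1$ is within 0.01 of the actual perturbed step response, and the finite-difference derivative agrees with the step response of $ST$.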
We now consider the specification

$$ |s_\alpha(1)| \le 0.75, \tag{9.26} $$

which limits the sensitivity of the step response at time $t = 1$ to gain variations in
$P_0^{\rm std}$. We will show that this specification is not convex.

The controller $K^{(a)}$ yields a closed-loop transfer matrix $H^{(a)}$ with $s_\alpha^{(a)}(1) =
0.697$, so $H^{(a)}$ satisfies the specification (9.26). The controller $K^{(b)}$ yields a closed-loop
transfer matrix $H^{(b)}$ with $s_\alpha^{(b)}(1) = 0.702$, so $H^{(b)}$ also satisfies the specification (9.26).
However, the average of these two transfer matrices, $(H^{(a)} + H^{(b)})/2$,
has a step response sensitivity at $t = 1$ of 0.786, so $(H^{(a)} + H^{(b)})/2$ does not satisfy
the specification (9.26). Therefore the specification (9.26) is not convex.







Figure 9.6 When $P_0^{\rm std}$ is replaced by $0.8P_0^{\rm std}$, the step response changes
from $s$ to $s^{\rm pert}$. The first order approximation of $s^{\rm pert}$ is given by $s^{\rm approx}(t) =
s(t) - 0.2s_\alpha(t)$.

Figure 9.7 The sensitivity of the step response to plant gain changes is
shown for the controller $K^{(a)}$. The first order approximation of the step
response falls in the shaded envelope when $P_0^{\rm std}$ is replaced by $\alpha P_0^{\rm std}$, for
$0.8 \le \alpha \le 1.2$.

9.3.2 Some Convex Approximations

In many cases there are useful convex approximations to specifications that limit
general differential sensitivities of the closed-loop system.
Consider the specification

$$ |s_\alpha(t)| \le 0.75 \quad \text{for } t \ge 0, \tag{9.27} $$

which limits the sensitivity of the step response to gain variations in $P_0$. This
specification is equivalent to

$$ \|ST\|_{\rm pk\_step} \le 0.75, $$

which is not closed-loop convex. We will describe two convex approximations for
the nonconvex specification (9.27).
Suppose

$$ s_{\min}(t) \le s(t) \le s_{\max}(t) \quad \text{for } t \ge 0 \tag{9.28} $$

is a design specification (see figure 8.5). A weak approximation of the sensitivity
specification (9.26) (along with the step response specification (9.28)) is that a
typical (and therefore fixed) step response satisfies the specification:

$$ \|ST_{\rm typ}\|_{\rm pk\_step} \le 0.75, $$

where $T_{\rm typ}$ is the transfer function that has unit step response

$$ s_{\rm typ}(t) = \frac{s_{\min}(t) + s_{\max}(t)}{2}. $$

A stronger approximation of (9.27) (along with the step response specification (9.28))
requires that the sensitivity specification be met for every step response
that satisfies (9.28):

$$ \max \{ \|Sv\|_\infty \mid s_{\min}(t) \le v(t) \le s_{\max}(t) \text{ for } t \ge 0 \} \le 0.75. $$

This is an inner approximation of (9.27), meaning that it is tighter than (9.27).

Notes and References

Feedback and Sensitivity 

The ability of feedback to make a system less sensitive to changes in the plant is discussed 
in essentially every book on feedback and control; see Mayr [May70] for a history of 
this idea. An early discussion (in the context of feedback amplifiers) can be found in 
Black [Bla34], in which we find: 

... by building an amplifier whose gain is deliberately made, say 40dB higher 
than necessary, and then feeding the output back on the input in such a 
way as to throw away the excess gain, it has been found possible to effect 
extraordinary improvement in constancy of amplification . . . By employing 
this feedback principle, amplifiers have been built and used whose gain varied
less than 0.01dB with a change in plate voltage from 240V to 260V [whereas]
for an amplifier of conventional design and comparable size this change would 
have been 0.7dB. 

For a later discussion see Horowitz [Hor63, Ch3]. A concise discussion appears in chapter
1, On the Advantages of Feedback, of Callier and Desoer [CD82a].

Differential Sensitivity 

Bode [Bod45] was the first to systematically study the effect of small (differential) changes
in closed-loop transfer functions due to small (differential) changes in the plant. On page
33 of [Bod45] we find (with our corresponding notation substituted),

The variation in the final gain characteristic [$T$] in dB, per dB change in the
gain of [$P_0$], is reduced in the ratio [$S$].

A recent exposition of differential sensitivity can be found in chapter 3 of Lunze [Lun89]. 

Comparison Sensitivity 

The notion of comparison sensitivity was introduced by Cruz and Perkins in [CP64];
see also the book edited by Cruz [Cru73]. The idea of an open-loop equivalent system,
however, is older. In [NGK57, §1.7], it is called the equivalent cascade configuration of
the control system. Recent discussions of comparison sensitivity can be found in Callier
and Desoer [CD82a, Ch1] and Anderson and Moore [AM90, §5.3].

Sensitivity Specifications that Limit Control System Performance 

The idea that sensitivity or robustness specifications can limit the achievable control system 
performance is explicitly expressed in, e.g., Newton, Gould, and Kaiser [NGK57, p23]: 

Control systems often employ mechanical, hydraulic, or pneumatic elements 
which have less reproducible behavior than high quality electric circuit ele- 
ments. This practical problem often causes the control designer to stop short 
of an optimum design because he knows full well that the parameters of the 
physical system may deviate considerably from the data on which he bases 
his design. 

A more recent paper that raised this issue, in the context of regulators designed by state- 
space methods, is Doyle and Stein [DS81]. 

Chapter 10 

Robustness Specifications via Gain Bounds 

In this chapter we consider robustness specifications, which limit the worst case 
variation in the closed-loop system that can be caused by a specific set of plant 
variations. We describe a powerful method for formulating inner approximations 
of robustness specifications as norm-bounds on the nominal closed-loop transfer 
matrix. These specifications are closed-loop convex. 

In the previous chapter we studied the differential sensitivity of the closed-loop 
system to variations in the plant. Differential sensitivity analysis often gives a 
good prediction of the changes that occur in the closed-loop system when the plant 
changes by a moderate (non-vanishing) amount, and hence, designs that satisfy 
differential sensitivity specifications are often robust to moderate changes in the 
plant. But differential sensitivity specifications cannot guarantee that the closed- 
loop system does not change dramatically (e.g., become unstable) when the plant 
changes by a non-vanishing amount. 

In this chapter we describe robustness specifications, which, like differential sen- 
sitivity specifications, limit the variation in the closed-loop system that can be 
caused by a change or perturbation in the plant. In this approach, however, 

• the sizes of plant variations are explicitly described, e.g., a particular gain 
varies ±1dB, 

• robustness specifications limit the worst case change in the closed-loop system 
that can be caused by one of the possible plant perturbations. 

By contrast, in the differential sensitivity approach, 

• the sizes of plant variations are not explicitly described; they are vaguely 
described as "small", 
• differential sensitivity specifications limit the first order changes in the closed- 
loop system that can be caused by the plant perturbations. 

Robustness specifications give guaranteed bounds on the performance deterio- 
ration, even for "large" plant variations, for which extrapolations from differential 
sensitivity specifications are dubious. Offsetting this advantage are some possible 
disadvantages of robustness specifications over differential sensitivity specifications: 

• It may not be possible to model the actual variations in the plant in the 
precise way required by robustness specifications. For example, we may not 
know whether to expect a ±1dB or a ±0.5dB variation in a particular gain. 

• It may not be desirable to limit the worst case variation in the closed-loop 
system, which results in a conservative design. A specification that limits the 
typical variations in the closed-loop system (however we may define typical) 
may better capture the designer's intention. 

Robustness specifications are often not closed-loop convex, just as the most gen- 
eral specifications that limit differential sensitivity are not closed-loop convex. We 
will describe a general small gain method for formulating convex inner approxima- 
tions of robustness specifications; the Notes and References for this chapter describe 
some of the attempts that have been made to make approximations of robustness 
specifications that are less conservative, but not convex. Since we will be describ- 
ing convex approximations of robustness specifications, we should add the following 
item to the list of possible disadvantages: 

• The small gain based convex inner approximations of robustness specifications 
can be poor approximations. Thus, designs based on these approximations 
can be conservative. 

This topic is addressed in some of the references at the end of this chapter. 

In the next section we give a precise and general definition of a robustness 
specification, which may appear abstract on first reading. In the remainder of this 
chapter we describe the framework for small gain methods, and then the small 
gain methods themselves. The framework and methods are demonstrated on some 
simple, specific examples that are based on our standard example SASS 1-DOF 
control system described in section 2.4. These examples continue throughout the 
chapter. 

10.1 Robustness Specifications 

10.1.1 Some Definitions 

In this section we give a careful definition of a robustness specification; we defer until 
the next section examples of common robustness specifications. Roughly speaking, 
a robustness specification requires that some design specification $\mathcal{D}$ must hold, even 
if the plant $P$ is replaced by any $P^{\mathrm{pert}}$ from a specified set $\mathcal{P}$ of possible perturbed 
plants. 

Let us be more precise. Suppose that $\mathcal{P}$ is any set of $(n_z + n_y) \times (n_w + n_u)$ 
transfer matrices. We will refer to $\mathcal{P}$ as the perturbed plant set, and its elements as 
perturbed plants. Let $\mathcal{D}$ denote some design specification, i.e., a boolean function 
on $n_z \times n_w$ transfer matrices, and let $K$ denote any $n_u \times n_y$ transfer matrix. 

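Definition 10.1: We say $\mathcal{D}$ holds robustly for $K$ and $\mathcal{P}$ if for each $P^{\mathrm{pert}} \in \mathcal{P}$, $\mathcal{D}$ 
holds for the transfer matrix $P^{\mathrm{pert}}_{zw} + P^{\mathrm{pert}}_{zu} K (I - P^{\mathrm{pert}}_{yu} K)^{-1} P^{\mathrm{pert}}_{yw}$. 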

In words, the design specification $\mathcal{D}$ holds robustly for $K$ and $\mathcal{P}$ if the controller 
$K$ connected to any of the perturbed plants $P^{\mathrm{pert}} \in \mathcal{P}$ yields a closed-loop system 
that satisfies $\mathcal{D}$. Definition 10.1 is not, by itself, a design specification: it is a 
property of a controller and a set of transfer matrices. Note also that definition 10.1 
makes no mention of the plant $P$. 
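
Definition 10.1 can be illustrated numerically. The following is a minimal sketch, assuming constant (static) matrices stand in for transfer matrices, a finite perturbed plant set, and a toy 1-DOF plant structure with inputs (n_proc, n_sensor, r, u) and outputs (y_p, u, y). The function names, the controller gain, and the peak-entry specification are all hypothetical choices made for illustration.

```python
import numpy as np

def closed_loop(P, K, n_z, n_w):
    """Return H = P_zw + P_zu K (I - P_yu K)^{-1} P_yw for a partitioned matrix P."""
    P_zw, P_zu = P[:n_z, :n_w], P[:n_z, n_w:]
    P_yw, P_yu = P[n_z:, :n_w], P[n_z:, n_w:]
    I = np.eye(P_yu.shape[0])
    return P_zw + P_zu @ K @ np.linalg.solve(I - P_yu @ K, P_yw)

def holds_robustly(spec, K, perturbed_plants, n_z, n_w):
    """Definition 10.1: spec must hold for K connected to every perturbed plant."""
    return all(spec(closed_loop(P, K, n_z, n_w)) for P in perturbed_plants)

def one_dof_plant(p0):
    """Assumed static 1-DOF plant: rows (y_p, u, y), columns (n_proc, n_sensor, r, u)."""
    return np.array([[ p0,  0.0, 0.0,  p0],
                     [0.0,  0.0, 0.0, 1.0],
                     [-p0, -1.0, 1.0, -p0]])

def peak_at_most(bound):
    """Hypothetical design specification: every entry of H has magnitude <= bound."""
    return lambda H: np.max(np.abs(H)) <= bound

K = np.array([[2.0]])                                      # hypothetical controller gain
plants = [one_dof_plant(a) for a in (0.668, 1.0, 1.585)]   # hypothetical perturbed set

print(holds_robustly(peak_at_most(1.0), K, plants, n_z=2, n_w=3))
print(holds_robustly(peak_at_most(0.8), K, plants, n_z=2, n_w=3))
```

In this toy setting the looser bound holds robustly while the tighter one does not, even though the nominal member of the set satisfies both. The check is a property of the controller and the set, as the text notes.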

Once we have the concept of a design specification holding robustly for a given 
controller and perturbed plant set, we can define the notion of a robustness speci- 
fication, which will involve the plant P. 

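Definition 10.2: The robustness specification $\mathcal{D}_{\mathrm{rob}}$ formed from $\mathcal{D}$, $\mathcal{P}$, and $P$ is 
given by: 

$\mathcal{D}_{\mathrm{rob}}$: $\mathcal{D}$ holds robustly for $K$ and $\mathcal{P}$, 

for every $K$ that satisfies 

$$H = P_{zw} + P_{zu} K (I - P_{yu} K)^{-1} P_{yw}. \tag{10.1}$$ 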

Thus a robustness specification is formed from a design specification $\mathcal{D}$, a perturbed 
plant set $\mathcal{P}$, and the plant $P$. The reader should note that $\mathcal{D}_{\mathrm{rob}}$ is indeed 
a design specification: it is a boolean function of $H$. We can interpret $\mathcal{D}_{\mathrm{rob}}$ as 
follows: if $H$ satisfies $\mathcal{D}_{\mathrm{rob}}$ and $K$ is any controller that yields the closed-loop transfer 
matrix $H$ (when connected to $P$), then the closed-loop transfer matrices that result from 
connecting $K$ to any $P^{\mathrm{pert}} \in \mathcal{P}$ will all satisfy $\mathcal{D}$. 

A sensible formulation of the plant will include signals such as sensor and 
actuator noises and the sensor and actuator signals (recall chapter 7). In this case 
the controller $K$ is uniquely determined by a closed-loop transfer matrix $H$ that 
is realizable, since $(I - P_{yu} K)^{-1}$ will appear as a submatrix of $H$, and we can 
determine $K$ from this transfer matrix. In these cases we may substitute "the $K$" 
for "every $K$" in definition 10.2. 

In many cases, $P \in \mathcal{P}$, and $\mathcal{P}$ consists of transfer matrices that are "close" to $P$. 
In this context $P$ is sometimes called the nominal plant. In this case the robustness 
specification $\mathcal{D}_{\mathrm{rob}}$ requires that even with the worst perturbed plant substituted for 
the nominal plant, the design specification $\mathcal{D}$ will continue to hold. 

If the design specification $\mathcal{D}$ is $\mathcal{D}_{\mathrm{stable}}$, i.e., closed-loop stability (see chapter 7), 
we call $\mathcal{D}_{\mathrm{rob}}$ the robust stability design specification associated with $\mathcal{P}$ and $P$. 

Throughout this chapter, $P$ is understood, so the robustness specification will 
be written 

$$\mathcal{D}_{\mathrm{rob}}(\mathcal{P}, \mathcal{D}).$$ 

The robust stability specification associated with the perturbed plant set $\mathcal{P}$ will be 
denoted 

$$\mathcal{D}_{\mathrm{rob\_stab}}(\mathcal{P}) = \mathcal{D}_{\mathrm{rob}}(\mathcal{P}, \mathcal{D}_{\mathrm{stable}}).$$ 

10.1.2 Time-Varying and Nonlinear Perturbations 

It is possible to extend the perturbed plant set to include time-varying or nonlinear 
systems, although this requires some care since many of our basic concepts and 
notation depend on our assumption 2.2 that the plant is LTI. Such an extension 
is useful for designing a controller $K$ for a nonlinear or time-varying plant $P^{\mathrm{nonlin}}$. 
The controller $K$ is often designed for a "nominal" LTI plant $P$ that is in some 
sense "close" to $P^{\mathrm{nonlin}}$; $P^{\mathrm{nonlin}}$ is then considered to be a perturbation of $P$. 

In this section we briefly and informally describe how we may modify our 
framework to include such nonlinear or time-varying perturbations. In this case the 
perturbed plant is a nonlinear or time-varying system with $n_w + n_u$ inputs and $n_z + n_y$ 
outputs. The perturbed closed-loop system, obtained by connecting the controller 
between the signals $y$ and $u$ of the perturbed plant, is now also nonlinear or 
time-varying, so the perturbed closed-loop system cannot be described by an $n_z \times n_w$ 
transfer matrix, as in (10.1). Instead, the closed-loop system is described by the 
nonlinear or time-varying closed-loop operator that maps $w$ into the resulting $z$. 

A design specification will simply be a predicate of the closed-loop system. The 
only predicate that we will consider is closed-loop stability, which, roughly speaking, 
means that z is bounded whenever w is bounded (the definition of closed-loop 
stability given in chapter 7 does not apply here, since it refers to the transfer matrix 
H). The reader can consult the references given at the end of this chapter for precise 
and detailed definitions of closed-loop stability of nonlinear or time-varying systems. 

The robust stability specification $\mathcal{D}_{\mathrm{rob\_stab}}$ will mean that when the (LTI) 
controller $K$, which is designed on the basis of the (LTI) plant $P$, is connected to any 
of the nonlinear or time-varying perturbed plants in $\mathcal{P}$, the resulting (nonlinear or 
time-varying) closed-loop system is stable. 

10.2 Examples of Robustness Specifications 

In this section we consider some examples of robustness specifications, organized 
by their associated plant perturbation sets. Most of these robustness specifications 
are not convex, but later in this chapter we describe a general method of forming 
convex inner approximations of these specifications. 



10.2.1 Finite Plant Perturbation Sets 

A simple but important case occurs when $\mathcal{P}$ is a finite set: 

$$\mathcal{P} = \{P_1, \ldots, P_N\}. \tag{10.2}$$ 



Neglected Dynamics 

Recall from chapter 1 that $P$ may be a simple (but not very accurate) model of 
the system to be controlled. Our perturbed plant set might then be $\mathcal{P} = \{P^{\mathrm{cmplx}}\}$, 
where $P^{\mathrm{cmplx}}$ is a complex, detailed, and accurate model of the system to be 
controlled. In this case, the robustness specification $\mathcal{D}_{\mathrm{rob}}$ guarantees that the controller 
we design using the simple model $P$ will, when connected to $P^{\mathrm{cmplx}}$, satisfy the 
design specification $\mathcal{D}$. 

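As a specific example, suppose that our plant is our standard numerical example, 
the 1-DOF controller described in section 2.4, with plant 

$$P = \begin{bmatrix} P_0^{\mathrm{std}} & 0 & 0 & P_0^{\mathrm{std}} \\ 0 & 0 & 0 & 1 \\ -P_0^{\mathrm{std}} & -1 & 1 & -P_0^{\mathrm{std}} \end{bmatrix}.$$ 
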

The more detailed model of the system to be controlled might take into account a 
high frequency resonance and roll-off in the system dynamics and some fast sensor 
dynamics, neither of which is included in the plant model P: 
where 

$$P_0^{\mathrm{cmplx}}(s) = P_0^{\mathrm{std}}(s) \, \frac{1}{1 + 1.25(s/100) + (s/100)^2}, \qquad H_{\mathrm{sens}}(s) = \frac{1}{1 + s/80}.$$ 

This is shown in figure 10.1 below (c.f. figure 2.11). 
For this example, the perturbed plant set is 

$$\mathcal{P} = \{P^{\mathrm{cmplx}}\}. \tag{10.3}$$ 


The robust stability specification $\mathcal{D}_{\mathrm{rob\_stab}}$ that corresponds to (10.3) requires 
that the controller designed on the basis of the nominal plant $P$ will also stabilize the 
complex model $P^{\mathrm{cmplx}}$ of the system to be controlled. Roughly speaking, $\mathcal{D}_{\mathrm{rob\_stab}}$ 
requires that the system cannot be made unstable by the high frequency resonance 
and roll-off in the system dynamics and the dynamics of the sensor, which are 
ignored in the model $P$. 

Figure 10.1 The controller $K$, which is designed on the basis of the model 
$P_0^{\mathrm{std}}$, is connected to a more detailed model of the system to be controlled, 
$P^{\mathrm{cmplx}}$. The model $P^{\mathrm{cmplx}}$ includes a high frequency resonance and roll-off 
in $P_0^{\mathrm{cmplx}}$, and the sensor dynamics $(1 + s/80)^{-1}$. 



Failure Modes 

The perturbed plants in (10.2) may represent different failure modes of the system to 
be controlled. For example, $P_1$ might be a model of the system to be controlled after 
an actuator has failed (i.e., $P_1$ is $P$, but with the column associated with the failed 
actuator set to zero). In this case the specification of robust stability guarantees 
that the closed-loop system will remain stable, despite the failures modeled by 
$P_1, \ldots, P_N$. 
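
The construction of a failure-mode plant described above can be sketched as follows. This is a minimal illustration, assuming the plant is sampled as a constant matrix whose first `n_w` columns correspond to the exogenous inputs and whose remaining columns correspond to the actuators; the function name and the numerical values are hypothetical.

```python
import numpy as np

def actuator_failure(P, n_w, failed):
    """Return a copy of P with the column of the failed actuator set to zero.

    `failed` indexes the actuator within u, so the column is n_w + failed.
    """
    P_fail = np.array(P, dtype=complex, copy=True)
    P_fail[:, n_w + failed] = 0.0
    return P_fail

# Hypothetical plant sample with 3 exogenous inputs and 2 actuators.
P = np.arange(1.0, 16.0).reshape(3, 5)
failure_set = [actuator_failure(P, n_w=3, failed=i) for i in range(2)]
print(failure_set[0].real)
```

The list `failure_set` then plays the role of a finite perturbed plant set, one member per single-actuator failure.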



10.2.2 Parametrized Plant Perturbations 

In some cases the perturbed plant set $\mathcal{P}$ can be described by some parameters that 
vary over ranges: 

$$\mathcal{P} = \{P^{\mathrm{pert}}(\alpha) \mid L_1 \le \alpha_1 \le U_1, \ldots, L_k \le \alpha_k \le U_k\}.$$ 

In this case we often have $P \in \mathcal{P}$; the corresponding parameter is called the nominal 
parameter: 

$$P = P^{\mathrm{pert}}(\alpha^{\mathrm{nom}}).$$ 

Parametrized plant perturbation sets can be used to model several different 
types of plant variation: 

• Component tolerances. A single controller K is to be designed for many 
plants, for example, a manufacturing run of the system to be controlled. The 
controller is designed on the basis of a nominal plant; the parameter variations 
represent the (slight, one hopes) differences in the individual manufactured 
systems. Designing a controller that robustly achieves the design specifications 
avoids the need and cost of tuning each manufactured control system (see 
section 1.1.5). 

• Component drift or aging. A controller is designed for a system that is well 
modeled by P, but it is desired that the system should continue to work if or 
when the system to be controlled changes, due to aging or drift in its compo- 
nents. Designing a controller that robustly achieves the design specifications 
avoids the need and cost of periodically re-tuning the control system. 

• Externally induced changes. The system to be controlled may be well mod- 
eled as an LTI system that depends on an external operating condition, which 
changes slowly compared to the system dynamics. Examples include temper- 
ature induced variations in a system, and the effects of varying aerodynamic 
pressure on aircraft dynamics. Designing a controller that robustly achieves 
the design specifications can avoid the need for a gain-scheduled or adaptive 
controller. (See the Notes and References in chapter 2.) 

• Model parameter uncertainty. A parametrized perturbed plant set can model 
uncertainty in modeling the system to be controlled (see section 1.1.2). In 
a model developed from physical principles, the $\alpha_i$ may represent physical 
parameters such as lengths, masses, and heat conduction coefficients, and 
the bounds $L_i$ and $U_i$ are then minimum and maximum values that could 
be expected to occur. In a black box model derived from an identification 
procedure, the $\alpha_i$ could represent transfer function coefficients, and the $L_i$ 
and $U_i$ might represent the 90% confidence bands for the identified model. 

Example: Gain Margins 

Gain margin specifications are examples of classical robustness specifications that 
are associated with a parametrized plant perturbation set. We consider the classical 
1-DOF controller, with perturbed plant set described informally by 

$$P_0^{\mathrm{pert}}(s) = \alpha P_0(s), \qquad L \le \alpha \le U.$$ 

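More precisely, we have the perturbed plant set 

$$\mathcal{P} = \left\{ \begin{bmatrix} \alpha P_0 & 0 & 0 & \alpha P_0 \\ 0 & 0 & 0 & 1 \\ -\alpha P_0 & -1 & 1 & -\alpha P_0 \end{bmatrix} \;\middle|\; L \le \alpha \le U \right\}, \qquad \alpha^{\mathrm{nom}} = 1, \tag{10.4}$$ 

where $0 < L < 1 < U$. 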

The specification of robust stability with the perturbed plant set (10.4) is 
described in classical terms as a positive gain margin of $20 \log_{10} U$ dB and a negative 
gain margin of $20 \log_{10} L$ dB. 

As an example we will use later, the robustness specification that requires gain 
margins of +4dB and −3.5dB is given by 

$$\mathcal{D}_{+4,-3.5\mathrm{dB\_gm}}: \quad \mathcal{D}_{\mathrm{rob\_stab}}(\mathcal{P}) \ \text{with} \ L = 0.668, \ U = 1.585, \tag{10.5}$$ 

with the plant perturbation set (10.4). 
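
The values of $L$ and $U$ in (10.5) follow from inverting the decibel formulas for the gain margins. A minimal sketch of the conversion (the function name is ours, chosen for illustration):

```python
def gain_margin_bounds(positive_margin_db, negative_margin_db):
    """Convert gain margins in dB to the gain interval [L, U].

    Inverts 20*log10(U) = positive margin and 20*log10(L) = negative margin.
    """
    U = 10.0 ** (positive_margin_db / 20.0)
    L = 10.0 ** (negative_margin_db / 20.0)
    return L, U

L, U = gain_margin_bounds(4.0, -3.5)
print(f"L = {L:.3f}, U = {U:.3f}")  # prints "L = 0.668, U = 1.585"
```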

Example: Pole Variation 

The parameter vector $\alpha$ may determine pole or zero locations in the plant. As a 
specific example, consider the perturbed plant set for the standard 1-DOF example 
of section 2.4, described informally by 

$$P_0^{\mathrm{pert}}(s) = \frac{1}{s^2} \, \frac{\alpha - s}{\alpha + s}, \qquad 5 \le \alpha \le 15, \qquad \alpha^{\mathrm{nom}} = 10. \tag{10.6}$$ 




Some of these phase variations in $P_0$ are shown in figure 10.2. 

Figure 10.2 The perturbed plants described by (10.6) consist of a phase 
shift in $P_0^{\mathrm{std}}$; shown here are the phase shifts for $\alpha - \alpha^{\mathrm{nom}} = \pm 1, \pm 3, \pm 5$. 
(c.f. figure 9.2, which shows a particular phase shift.) 

The robust stability specification $\mathcal{D}_{\mathrm{rob\_stab}}$ is a strengthening of the stability 
specification $\mathcal{D}_{\mathrm{stable}}$: $\mathcal{D}_{\mathrm{rob\_stab}}$ requires that the controller stabilize not only the 
plant $P$, but also the perturbed plants (10.6). $\mathcal{D}_{\mathrm{rob\_stab}}$ can be thought of as a type 
of phase margin specification. 
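
To see why the pole variation (10.6) acts as a pure phase perturbation, one can compare the perturbed and nominal frequency responses at a fixed frequency. The sketch below assumes the form of (10.6) with a double integrator times an all-pass factor; the helper names and the chosen frequency are illustrative only.

```python
import numpy as np

def p0_pert(s, alpha):
    """Perturbed plant of (10.6): a double integrator times an all-pass factor."""
    return (1.0 / s**2) * (alpha - s) / (alpha + s)

def shift_relative_to_nominal(omega, alpha, alpha_nom=10.0):
    """Return (magnitude ratio, phase shift in degrees) of the perturbed plant
    relative to the nominal plant at frequency omega."""
    s = 1j * omega
    ratio = p0_pert(s, alpha) / p0_pert(s, alpha_nom)
    return abs(ratio), np.degrees(np.angle(ratio))

for alpha in (5.0, 10.0, 15.0):
    mag, phase = shift_relative_to_nominal(omega=2.0, alpha=alpha)
    print(f"alpha = {alpha:4.1f}: |ratio| = {mag:.3f}, phase shift = {phase:+.2f} deg")
```

The magnitude ratio stays at one for every $\alpha$, so only the phase of the plant changes, which is consistent with reading the specification as a kind of phase margin.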



10.2.3 Unknown-but-Bounded Transfer Function Perturbations 

It is often useful to model the uncertainty in the plant (as a model of the system 
to be controlled) as frequency-dependent errors in the frequency responses of its 
entries. Such plant perturbation sets can be used to account for: 

• Model uncertainty. The plant transfer functions may inaccurately model the 
system to be controlled because of measurement or identification errors. For 
example, the transfer functions of the system to be controlled may have been 
measured at each frequency to an accuracy of 1%, or these measurements 
might be repeatable only to 1%. 

• High frequency parasitic dynamics. A model of the system to be controlled 
may become less accurate at high frequencies because of unknown or unmod- 
eled parasitic dynamics. Moreover these parasitic dynamics may change with 
time or other physical parameters, and so cannot be confidently modeled. In 
electrical systems, for example, we may have small stray capacitances and 
(self and mutual) inductances between conductors; these parasitic dynamics 
can change significantly when the electrical or magnetic environment of the 
system is changed. 

In state-space plant descriptions, the addition of high frequency parasitic dy- 
namics is called a singular perturbation, because the perturbed plant has more 
states than the plant. 

Since these plant perturbation sets cannot be described by the variation of a 
small number of real parameters, they are sometimes called nonparametric plant 
perturbations. 

There are some subtle distinctions between intentionally neglected system dy- 
namics that could in principle be modeled, and parasitic dynamics that cannot be 
confidently modeled. For example, a model of a mechanical system may be devel- 
oped on the assumption that a drive train is rigid, an assumption that is good at 
low frequencies, but poor at high frequencies. If the high frequency dynamics of this 
drive train could be accurately modeled or consistently measured, then we could 
develop a more accurate (and more complex) model of the system to be controlled, 
as in section 10.2.1. However, it may be the case that these high frequency dynam- 
ics are very sensitive to minor physical variations in the system, such as might be 
induced by temperature changes, bearing wear, and so on. In this case the drive 
train dynamics could reasonably be modeled as an unknown transfer function that 
is close to one at low frequencies, and less close at high frequencies. 

Example: Relative Uncertainty in $P_0$ 

We consider again our standard SASS 1-DOF control system example. Suppose we 
believe that the relative or fractional error in the transfer function $P_0$ (as a model of 
the system to be controlled) is about 20% at low frequencies (say, $\omega \le 5$), and much 
larger at high frequencies (say, up to 400% for $\omega \ge 500$). We define the relative 
error as 

$$P_0^{\mathrm{rel\_err}} \triangleq \frac{P_0^{\mathrm{pert}} - P_0}{P_0}, \quad \text{so that} \quad P_0^{\mathrm{pert}} = (1 + P_0^{\mathrm{rel\_err}}) P_0. \tag{10.7}$$ 

Our plant model uncertainty can be described by a frequency-dependent limit on 
the magnitude of $P_0^{\mathrm{rel\_err}}$, e.g., 

$$\left\| P_0^{\mathrm{rel\_err}} / W_{\mathrm{rel\_err}} \right\|_\infty \le 1, \tag{10.8}$$ 

where 

$$W_{\mathrm{rel\_err}}(s) = 0.2 \, \frac{1 + s/10}{1 + s/200}. \tag{10.9}$$ 




We interpret $|W_{\mathrm{rel\_err}}(j\omega)|$, which is plotted in figure 10.3, as the maximum relative 
error in $P_0(j\omega)$. We say that $P_0^{\mathrm{rel\_err}}$ is an unknown-but-bounded transfer function. 
One interpretation is shown in figure 10.4. 

Figure 10.3 From (10.8), $|W_{\mathrm{rel\_err}}(j\omega)|$ is the maximum relative error in 
$P_0(j\omega)$. This represents a relative error of 20% at low frequencies and up 
to 400% at high frequencies. 
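
The low and high frequency limits of the weight (10.9), and the membership test implied by (10.7) and (10.8), can be checked numerically. This is only a sketch: it assumes the nominal plant is (10.6) with $\alpha = 10$, and it tests the bound on a finite frequency grid, which is a necessary but not a sufficient check of the norm bound. The function names and the two trial perturbations are hypothetical.

```python
import numpy as np

def w_rel_err(s):
    """Weight (10.9) bounding the relative plant error."""
    return 0.2 * (1 + s / 10) / (1 + s / 200)

def p0_std(s):
    """Nominal plant, assumed to be (10.6) evaluated at alpha = 10."""
    return (10 - s) / (s**2 * (10 + s))

def within_bound(p0_pert, omegas):
    """Check |relative error| <= |W_rel_err| at each grid frequency."""
    s = 1j * np.asarray(omegas, dtype=float)
    rel_err = (p0_pert(s) - p0_std(s)) / p0_std(s)
    return bool(np.all(np.abs(rel_err) <= np.abs(w_rel_err(s))))

omegas = np.logspace(-2, 4, 400)
print(abs(w_rel_err(0j)), abs(w_rel_err(1j * 1e6)))     # near 0.2 and near 4
print(within_bound(lambda s: 1.1 * p0_std(s), omegas))  # 10% gain error
print(within_bound(lambda s: 1.3 * p0_std(s), omegas))  # 30% gain error
```

A constant 10% gain error stays inside the bound at every grid frequency, while a 30% gain error violates it at low frequencies, where the weight allows only 20%.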

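The perturbed plant set for this example is thus 

$$\mathcal{P} = \left\{ \begin{bmatrix} P_0^{\mathrm{pert}} & 0 & 0 & P_0^{\mathrm{pert}} \\ 0 & 0 & 0 & 1 \\ -P_0^{\mathrm{pert}} & -1 & 1 & -P_0^{\mathrm{pert}} \end{bmatrix} \;\middle|\; \left\| P_0^{\mathrm{rel\_err}} / W_{\mathrm{rel\_err}} \right\|_\infty \le 1 \right\}, \tag{10.10}$$ 

where $P_0^{\mathrm{pert}}$ is given by (10.7). 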

We note for future reference two robustness specifications using the perturbed 
plant set (10.10). The first is robust stability, $\mathcal{D}_{\mathrm{rob\_stab}}$, and the second is the 
stronger specification that these plant perturbations never cause the RMS gain 
from the reference input to the actuator signal to exceed 75: 

$$\mathcal{D}_{\mathrm{rob}}(\mathcal{P}, \ \|T/P_0\|_\infty \le 75) \tag{10.11}$$ 

($T/P_0$ is the transfer function from the reference input to the actuator signal). 

Figure 10.4 The complex transfer function $P_0^{\mathrm{pert}}(j\omega)$ is shown versus 
frequency $\omega$. Circles that are centered at the nominal plant transfer function, 
$P_0(j\omega)$, with radius $|W_{\mathrm{rel\_err}}(j\omega) P_0(j\omega)|$ are also shown. (10.7) and (10.8) 
require that the perturbed plant transfer function, $P_0^{\mathrm{pert}}$, must lie within 
the region enclosed by these circles. 

10.2.4 Neglected Nonlinearities 

Example: Actuator Saturation 

A common nonlinearity encountered in systems to be controlled is saturation of the 
actuator signals, shown in figure 10.5. This system is described by 

(10.12) 
(10.13) 

where $P^{\mathrm{lin}}$ is an LTI system and the unit saturation function $\mathrm{Sat} : \mathbf{R} \to \mathbf{R}$ is 
defined by 

$$\mathrm{Sat}(a) = \begin{cases} a & |a| \le 1 \\ 1 & a > 1 \\ -1 & a < -1. \end{cases} \tag{10.14}$$ 

$S_i$ is called the saturation threshold of the $i$th actuator: if the magnitude of each 
actuator signal is always below its saturation threshold, then the system is LTI, 
with transfer matrix $P^{\mathrm{lin}}$. 
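
The unit saturation (10.14) and its use with per-actuator thresholds can be sketched as follows. The scaled form used here, in which the $i$th actuator signal is clipped at $\pm S_i$ by computing $S_i \, \mathrm{Sat}(u_i / S_i)$, is our assumption about how the thresholds enter; the function names and numerical values are hypothetical.

```python
import numpy as np

def sat(a):
    """Unit saturation (10.14): identity on [-1, 1], clipped to +/-1 outside."""
    return np.clip(a, -1.0, 1.0)

def saturate_actuators(u, thresholds):
    """Assumed threshold form: u_sat[i] = S[i] * Sat(u[i] / S[i])."""
    u = np.asarray(u, dtype=float)
    S = np.asarray(thresholds, dtype=float)
    return S * sat(u / S)

u = np.array([0.5, -3.0, 2.5])   # hypothetical actuator signals
S = np.array([1.0, 2.0, 2.0])    # hypothetical saturation thresholds
print(saturate_actuators(u, S))  # first signal passes unchanged, the others clip
```

When every entry of `u` stays below its threshold in magnitude, the function returns `u` unchanged, which mirrors the remark that the system is then LTI.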

Figure 10.5 A system to be controlled consists of the LTI system $P^{\mathrm{lin}}$, 
driven by a saturated actuator signal, $u^{\mathrm{sat}}$. The block diagram for the 
saturation nonlinearity shows its graph. 



One approach to designing a controller for this nonlinear system is to define the 
LTI plant $P = P^{\mathrm{lin}}$, and consider the saturation as a nonlinear perturbation of $P$. 
Thus we consider the perturbed plant set consisting of the single nonlinear system 
given by (10.12-10.13): 

$$\mathcal{P} = \{P^{\mathrm{nonlin}}\}. \tag{10.15}$$ 



By designing an LTI controller for P that yields robust stability for (10.15), we 
are guaranteed that when the controller is connected to the nonlinear system to 
be controlled, the resulting nonlinear closed-loop system will at least be stable. If 
the actuator signals only occasionally exceed their thresholds, then the closed-loop 
transfer matrix H can give a good approximation of the behavior of the nonlinear 
closed-loop system. (Another useful approach, also based on the idea of considering 
the saturators as a perturbation of an LTI plant, is described in the Notes and 
References.) 

We should also mention the describing function method, a heuristic approach 
that is often successful in practice. Roughly speaking, the describing function 
method approximates the effect of the saturators as a gain reduction in the ac- 
tuator channels: such perturbations are handled via gain margin specifications. Of 
course, gain margin specifications do not guarantee that the nonlinear closed-loop 
system is stable. 

10.3 Perturbation Feedback Form 

In many cases the perturbed plant set $\mathcal{P}$ can be represented as the nominal plant 
with an internal feedback, as shown in figure 10.6. When the internal feedback $\Delta$ is 
zero, we recover the nominal plant $P$; each perturbed plant in $\mathcal{P}$ corresponds to a 
particular feedback $\Delta \in \boldsymbol{\Delta}$, where $\boldsymbol{\Delta}$ is a set of transfer matrices of the appropriate 
size. 

Figure 10.6 Each perturbed plant is equivalent to the nominal plant 
modified by the internal feedback $\Delta$. 



We will call $\Delta$ the feedback perturbation. The perturbed plant that results from 
the feedback perturbation $\Delta$ will be denoted $P^{\mathrm{pert}}(\Delta)$, and $\boldsymbol{\Delta}$ will be called the 
feedback perturbation set that corresponds to $\mathcal{P}$: 

$$\mathcal{P} = \{P^{\mathrm{pert}}(\Delta) \mid \Delta \in \boldsymbol{\Delta}\}. \tag{10.16}$$ 

The symbol $\Delta$ emphasizes its role in "changing" the plant $P$ into the perturbed 
plant $P^{\mathrm{pert}}$. 

The input signal to the perturbation feedback, denoted $q$, can be considered an 
output signal of the plant $P$. Similarly, the output signal from the perturbation 
feedback, denoted $p$, can be considered an input signal to the plant $P$. Throughout 
this chapter we will assume that the exogenous input signal $w$ and the regulated 
output signal $z$ are augmented to contain $p$ and $q$, respectively: 

$$w = \begin{bmatrix} \tilde{w} \\ p \end{bmatrix}, \qquad z = \begin{bmatrix} \tilde{z} \\ q \end{bmatrix},$$ 

where $\tilde{w}$ and $\tilde{z}$ denote the original signals from figure 10.6. This is shown in 
figure 10.7. 

To call $p$ an exogenous input signal can be misleading, since this signal does 
not originate "outside" the plant, like command inputs or disturbance signals, as 
the term exogenous implies. We can think of the signal $p$ as originating outside the 
nominal plant, as in figure 10.6. 

Figure 10.7 The plant $P$ showing $p$ as part of the exogenous input signal 
$w$ and $q$ as part of the regulated output signal $z$. By connecting the feedback 
perturbation $\Delta$ between $q$ and $p$, we recover the perturbed plant $P^{\mathrm{pert}}(\Delta)$. 

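To describe a perturbation feedback form of a perturbed plant set $\mathcal{P}$, we give 
the (augmented) plant transfer matrix 

$$P = \begin{bmatrix} P_{\tilde{z}\tilde{w}} & P_{\tilde{z}p} & P_{\tilde{z}u} \\ P_{q\tilde{w}} & P_{qp} & P_{qu} \\ P_{y\tilde{w}} & P_{yp} & P_{yu} \end{bmatrix}$$ 

along with the set $\boldsymbol{\Delta}$ of perturbation feedbacks. Our original perturbed plant can 
be expressed as 

$$P^{\mathrm{pert}}(\Delta) = \begin{bmatrix} P_{\tilde{z}\tilde{w}} & P_{\tilde{z}u} \\ P_{y\tilde{w}} & P_{yu} \end{bmatrix} + \begin{bmatrix} P_{\tilde{z}p} \\ P_{yp} \end{bmatrix} \Delta (I - P_{qp}\Delta)^{-1} \begin{bmatrix} P_{q\tilde{w}} & P_{qu} \end{bmatrix}. \tag{10.17}$$ 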



The perturbation feedback form, i.e., the transfer matrix $P$ in (10.17) and the 
set $\boldsymbol{\Delta}$, is not uniquely determined by the perturbed plant set $\mathcal{P}$. This fact will be 
important later. 

When $\mathcal{P}$ contains nonlinear or time-varying systems, the perturbation feedback 
form consists of an LTI $P$ and a set $\boldsymbol{\Delta}$ of nonlinear or time-varying systems. Roughly 
speaking, the feedback perturbation $\Delta$ represents the extracted nonlinear or 
time-varying part of the system. We will see an example of this later. 



10.3.1 Perturbation Feedback Form: Closed-Loop 

Suppose now that the controller $K$ is connected to the perturbed plant $P^{\mathrm{pert}}(\Delta)$, 
as shown in figures 10.8 and 10.9. 

Figure 10.8 When the perturbed plant set is expressed in the perturbation 
feedback form shown in figure 10.6, the perturbed closed-loop system can be 
represented as the nominal plant $P$, with the controller $K$ connected between 
$y$ and $u$ as usual, and the perturbation feedback $\Delta$ connected between $q$ and 
$p$. 

Figure 10.9 The perturbed closed-loop system can be represented as the 
nominal closed-loop system with feedback $\Delta$ connected from $q$ (a part of $z$) 
to $p$ (a part of $w$). Note the similarity to figure 2.2. 




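By substituting (10.17) into (2.7) we find that the transfer matrix of the 
perturbed closed-loop system is 

$$H^{\mathrm{pert}}(\Delta) = H_{\tilde{z}\tilde{w}} + H_{\tilde{z}p} \Delta (I - H_{qp}\Delta)^{-1} H_{q\tilde{w}}, \tag{10.18}$$ 

where 

$$H_{\tilde{z}\tilde{w}} = P_{\tilde{z}\tilde{w}} + P_{\tilde{z}u} K (I - P_{yu}K)^{-1} P_{y\tilde{w}} \tag{10.19}$$ 

$$H_{\tilde{z}p} = P_{\tilde{z}p} + P_{\tilde{z}u} K (I - P_{yu}K)^{-1} P_{yp} \tag{10.20}$$ 

$$H_{q\tilde{w}} = P_{q\tilde{w}} + P_{qu} K (I - P_{yu}K)^{-1} P_{y\tilde{w}} \tag{10.21}$$ 

$$H_{qp} = P_{qp} + P_{qu} K (I - P_{yu}K)^{-1} P_{yp}. \tag{10.22}$$ 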

Note the similarities between figures 10.9 and 2.2, and the corresponding equa- 
tions (10.18) and (2.7). Figure 2.2 and equation (2.7) show the effect of connect- 
ing the controller to the nominal plant to form the nominal closed-loop system; 
figure 10.9 and equation (10.18) show the effect of connecting the feedback 
perturbation $\Delta$ to the nominal closed-loop system to form the perturbed closed-loop 
system. 
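
The equivalence of the two orders of interconnection can be checked numerically. The following is a minimal sketch, assuming scalar signals and a single frequency sample, so that the augmented plant is a 3 by 3 complex matrix with rows (z~, q, y) and columns (w~, p, u). The function names and the random test values are ours and purely illustrative.

```python
import numpy as np

# Row indices (outputs z~, q, y) and column indices (inputs w~, p, u).
Z, Q, Y = 0, 1, 2
W, P_IN, U = 0, 1, 2

def perturb_then_close(P, K, delta):
    """Form the perturbed plant as in (10.17), then connect K between y and u."""
    fb = delta / (1.0 - P[Q, P_IN] * delta)
    def pert(a, b):
        return P[a, b] + P[a, P_IN] * fb * P[Q, b]
    return pert(Z, W) + pert(Z, U) * K / (1.0 - pert(Y, U) * K) * pert(Y, W)

def close_then_perturb(P, K, delta):
    """Form the nominal closed-loop blocks (10.19)-(10.22), then apply (10.18)."""
    fb = K / (1.0 - P[Y, U] * K)
    def H(a, b):
        return P[a, b] + P[a, U] * fb * P[Y, b]
    return H(Z, W) + H(Z, P_IN) * delta / (1.0 - H(Q, P_IN) * delta) * H(Q, W)

rng = np.random.default_rng(0)
P = 0.5 * (rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3)))
K, delta = 0.3 - 0.2j, 0.1 + 0.05j   # arbitrary illustrative values

print(np.isclose(perturb_then_close(P, K, delta), close_then_perturb(P, K, delta)))
```

Both routes solve the same set of interconnection equations, only eliminating the signals in a different order, so the two results agree whenever the loops are well posed.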

We may interpret 

$$H^{\mathrm{pert}}(\Delta) - H_{\tilde{z}\tilde{w}} = H_{\tilde{z}p} \Delta (I - H_{qp}\Delta)^{-1} H_{q\tilde{w}} \tag{10.23}$$ 

as the change in the closed-loop transfer matrix that is caused by the feedback 
perturbation $\Delta$. We have the following interpretations: 

• $H_{\tilde{z}\tilde{w}}$ is the closed-loop transfer matrix of the nominal system, before its 
exogenous input and regulated output were augmented with the signals $p$ and 
$q$. 

• $H_{q\tilde{w}}$ is the closed-loop transfer matrix from the original exogenous input signal 
$\tilde{w}$ to $q$. If $H_{q\tilde{w}}$ is "large", then so will be the signal $q$ that drives or excites 
the feedback perturbation $\Delta$. 

• $H_{\tilde{z}p}$ is the closed-loop transfer matrix from $p$ to the original regulated output 
signal $\tilde{z}$. If $H_{\tilde{z}p}$ is "large", then so will be the effect on $\tilde{z}$ of the signal $p$, which 
is generated by the feedback perturbation $\Delta$. 

• $H_{qp}$ is the closed-loop transfer matrix from $p$ to $q$. We can interpret $H_{qp}$ as 
the feedback seen by $\Delta$, looking into the nominal closed-loop system. 

Thus, if the three closed-loop transfer matrices $H_{\tilde{z}p}$, $H_{q\tilde{w}}$, and $H_{qp}$ are all "small", 
then our design will be "robust" to the perturbations, i.e., the change in the 
closed-loop transfer matrix, which is given in (10.23), will also be "small". This vague idea 
will be made more precise later in this chapter. 
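
One elementary way to quantify this idea, for scalar transfer functions sampled at a single frequency, is to bound the magnitude of the right-hand side of (10.23) using the triangle inequality. The sketch below is an illustration under those assumptions and is not necessarily the method developed later in the chapter; the function names and numerical values are hypothetical.

```python
def change_bound(h_zp, h_qw, h_qp, delta_max):
    """Upper bound on |H_zp * d / (1 - H_qp * d) * H_qw| over all |d| <= delta_max.

    Uses |1 - H_qp * d| >= 1 - |H_qp| * delta_max, which needs the product < 1.
    """
    loop_gain = abs(h_qp) * delta_max
    if loop_gain >= 1.0:
        raise ValueError("bound requires |H_qp| * delta_max < 1")
    return abs(h_zp) * delta_max * abs(h_qw) / (1.0 - loop_gain)

def actual_change(h_zp, h_qw, h_qp, delta):
    """Scalar version of the right-hand side of (10.23)."""
    return h_zp * delta / (1.0 - h_qp * delta) * h_qw

h_zp, h_qw, h_qp = 0.4 + 0.1j, -0.3j, 0.5 - 0.2j   # hypothetical frequency samples
delta_max = 0.5
bound = change_bound(h_zp, h_qw, h_qp, delta_max)
worst = max(abs(actual_change(h_zp, h_qw, h_qp, d)) for d in (-0.5, -0.25, 0.25, 0.5))
print(worst <= bound)
```

The bound shrinks as any of the three magnitudes shrinks, which matches the informal statement that small $H_{\tilde{z}p}$, $H_{q\tilde{w}}$, and $H_{qp}$ give a small change in the closed-loop transfer matrix.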

10.3.2 Examples of Perturbation Feedback Form 

In this section • will denote a transfer function that we have already given elsewhere. 
In this way we emphasize the transfer functions that are directly relevant to the 
perturbation feedback form. 



Neglected Dynamics 

Figure 10.10 shows one way to represent the perturbed plant set $\mathcal{P} = \{P^{\mathrm{cmplx}}\}$ 
described in section 10.2.1 in perturbation feedback form. In this block diagram, 
the perturbation feedback $\Delta$ acts as a switch: $\Delta = 0$ yields the nominal plant; 
$\Delta = I$ turns on the perturbation, to yield the perturbed plant $P^{\mathrm{cmplx}}$. 
This perturbation feedback form is described by the augmented plant 



(10.24) 

where 

$$P_0^{\mathrm{err}}(s) = \frac{-1.25(s/100) - (s/100)^2}{1 + 1.25(s/100) + (s/100)^2},$$ 

and the feedback perturbation set 

$$\boldsymbol{\Delta} = \{I\}. \tag{10.25}$$ 

When the controller is connected, we have 

(10.26) 
(10.27) 

(10.28) 



Gain Margin: Perturbation Feedback Form 1 

The perturbed plant set for the classical gain margin specification, given by $\mathcal{P}$ 
in (10.4), can be expressed in the perturbation feedback form shown in figure 10.11. 

Figure 10.10 Perturbation feedback form for the perturbed plant set that 
consists of the single transfer matrix $P^{\mathrm{cmplx}}$. The feedback perturbation $\Delta$ 
acts as a switch: $\Delta = 0$ yields the nominal plant; $\Delta = I$ yields the perturbed 
plant $P^{\mathrm{cmplx}}$. 



This perturbation feedback form is described by the augmented plant 

(10.29) 

and the perturbation feedback set 

$$\boldsymbol{\Delta} = [L - 1, \ U - 1], \tag{10.30}$$ 

which is an interval. Thus, the feedback perturbations are real constants, or gains. 
Informally, the perturbation $\Delta$ causes $P_0$ to become $(1 + \Delta)P_0$. 

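For this perturbation feedback form, the transfer matrices $H_{q\tilde{w}}$, $H_{\tilde{z}p}$, and $H_{qp}$ 
are given by 

$$H_{q\tilde{w}} = \begin{bmatrix} -T & -T/P_0 & T/P_0 \end{bmatrix}, \tag{10.31}$$ 

$$H_{\tilde{z}p} = \begin{bmatrix} P_0 S \\ -T \end{bmatrix}, \tag{10.32}$$ 

$$H_{qp} = -T. \tag{10.33}$$ 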
Figure 10.11 One possible perturbation feedback form for the classical
gain margin specification.

Gain Margin: Perturbation Feedback Form 2 

The same perturbed plant set, (10.4), can be described in the different perturbation 
feedback form shown in figure 10.12, for which 
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(10.34) 



(only one entry differs from (10.29)), and

$$\boldsymbol{\Delta} = [1 - 1/L,\; 1 - 1/U], \tag{10.35}$$

which is a different interval than (10.30). For this perturbation feedback form, $\Delta$
represents a constant that causes $P_0$ to become $P_0/(1 - \Delta)$.
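The two intervals (10.30) and (10.35) describe the same set of perturbed gains. The short sketch below checks this at the interval endpoints. The values of `L` and `U` are hypothetical, chosen only for illustration, and the function names are not from the text.

```python
import math

def perturbed_gain_form1(delta: float) -> float:
    """Gain factor applied to P0 in form 1: P0 becomes (1 + delta) * P0."""
    return 1.0 + delta

def perturbed_gain_form2(delta: float) -> float:
    """Gain factor applied to P0 in form 2: P0 becomes P0 / (1 - delta)."""
    return 1.0 / (1.0 - delta)

# Hypothetical gain margin limits (0 < L < 1 < U), for illustration only.
L, U = 0.67, 1.58

# Endpoints of the perturbation intervals (10.30) and (10.35).
form1_interval = (L - 1.0, U - 1.0)
form2_interval = (1.0 - 1.0 / L, 1.0 - 1.0 / U)

# Both forms map their interval endpoints onto the same gain range [L, U].
gains_form1 = tuple(perturbed_gain_form1(d) for d in form1_interval)
gains_form2 = tuple(perturbed_gain_form2(d) for d in form2_interval)
```

Both maps are monotone increasing on their intervals, so matching endpoints is enough to conclude that each form sweeps out exactly the gains between `L` and `U`.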
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Figure 10.12 Another perturbation feedback form for the classical gain 
margin specification. 

For this second perturbation feedback form, the transfer matrices $H_{qw}$, $H_{zp}$, and
$H_{qp}$ are given by

$$H_{qw} = \begin{bmatrix} -T & -T/P_0 & T/P_0 \end{bmatrix}, \tag{10.36}$$

$$H_{zp} = \begin{bmatrix} P_0 S \\ -T \end{bmatrix}, \tag{10.37}$$

$$H_{qp} = S. \tag{10.38}$$

Only $H_{qp}$ differs from its corresponding expression for the previous perturbation
feedback form.



Pole Variation 

In some cases, a plant pole or zero that depends on a parameter can be expressed 
in perturbation feedback form. As a specific example, figure 10.13 shows one way 
to express the specific example described in section 10.2.2 in perturbation feedback 
form. 




Figure 10.13 The variation in the phase shift of $P_0$, described by (10.6),
can be represented as the effect of a varying feedback gain $\Delta$ inside the
plant.



Here we have 
(10.39)

with feedback perturbation set

$$\boldsymbol{\Delta} = [-5,\; 5]. \tag{10.40}$$
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The closed-loop transfer matrices are given by 



Hqw — 
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U qp — 



s 2 (s + 10) 
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2s _ 
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(10.41) 



(10.42) 



(10.43) 



(The reader worried that these transfer matrices may be unstable should recall 
that the interpolation conditions of section 7.2.5 require that T(10) = 0; similar 
conditions guarantee that these transfer matrices are proper, and have no pole at 
s = 0.) 



Relative Uncertainty in $P_0$

The plant perturbation set (10.10) can be expressed in the perturbation feedback
form shown in figure 10.14, for which

(10.44)

and

$$\boldsymbol{\Delta} = \{\Delta \mid \|\Delta\|_\infty \le 1\}. \tag{10.45}$$

In this case the feedback perturbations are normalized unknown-but-bounded transfer
functions, that cause the transfer function $P_0$ to become $(1 + W_{\mathrm{rel\_err}}\Delta)P_0$.

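For this perturbation feedback form, the transfer matrices $H_{qw}$, $H_{zp}$, and $H_{qp}$
are given by

$$H_{qw} = \begin{bmatrix} -T & -T/P_0 & T/P_0 \end{bmatrix}, \tag{10.46}$$

$$H_{zp} = \begin{bmatrix} W_{\mathrm{rel\_err}} P_0 S \\ -W_{\mathrm{rel\_err}} T \end{bmatrix}, \tag{10.47}$$

$$H_{qp} = -W_{\mathrm{rel\_err}} T. \tag{10.48}$$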



Saturating Actuators 

We consider the perturbed plant set (10.15) which consists of the single nonlinear
system $\{P^{\mathrm{nonlin}}\}$. This can be expressed in perturbation feedback form by expressing
each saturator as a straight signal path perturbed by a dead-zone nonlinearity,

$$\mathrm{Dz}(a) = a - \mathrm{Sat}(a),$$

as shown in figure 10.15.

Figure 10.14 A perturbation feedback form for the plant perturbation
set (10.10).

Figure 10.15 The nonlinear system shown in figure 10.5 is redrawn as
the nominal plant, which is LTI, connected to $\Delta^{\mathrm{nonlin}}$, which is a dead-zone
nonlinearity.

This perturbation feedback form is described by the augmented plant

(10.49)

and $\boldsymbol{\Delta} = \{\Delta^{\mathrm{nonlin}}\}$, where $p = \Delta^{\mathrm{nonlin}}(q)$ is defined by

$$p_i(t) = S_i\,\mathrm{Dz}(q_i(t)/S_i), \quad i = 1, \ldots, n_u. \tag{10.50}$$

In this case the feedback perturbation $\Delta^{\mathrm{nonlin}}$ is a memoryless nonlinearity.

For this perturbation feedback form, the transfer matrices $H_{qw}$, $H_{zp}$, and $H_{qp}$
are given by

$$H_{qw} = K(I - P_{yu}^{\mathrm{lin}}K)^{-1}P_{yw}^{\mathrm{lin}}, \tag{10.51}$$

$$H_{zp} = -P_{zu}^{\mathrm{lin}}(I - KP_{yu}^{\mathrm{lin}})^{-1}, \tag{10.52}$$

$$H_{qp} = -KP_{yu}^{\mathrm{lin}}(I - KP_{yu}^{\mathrm{lin}})^{-1}. \tag{10.53}$$

10.4 Small Gain Method for Robust Stability 

10.4.1 A Convex Inner Approximation 

We consider a perturbed plant set $\mathcal{P}$ that is given by a perturbation feedback form.
Suppose that the norm $\|\cdot\|_{\mathrm{gn}}$ is a gain (see chapter 5). Let $M$ denote the maximum
gain of the possible feedback perturbations, i.e.,

$$M = \sup_{\Delta \in \boldsymbol{\Delta}} \|\Delta\|_{\mathrm{gn}}. \tag{10.54}$$

$M$ is thus a measure of how "big" the feedback perturbations can be.

Then from the small gain theorem described in section 5.4.2 (equations (5.29-5.31)
with $H_1 = \Delta$ and $H_2 = H_{qp}$) we know that if

$$\|H_{qp}\|_{\mathrm{gn}} M < 1 \tag{10.55}$$

then we have for all $\Delta \in \boldsymbol{\Delta}$,

$$\left\| \Delta (I - H_{qp}\Delta)^{-1} \right\|_{\mathrm{gn}} \le \frac{M}{1 - M\|H_{qp}\|_{\mathrm{gn}}}.$$

From (10.23) we therefore have

$$\left\| H^{\mathrm{pert}}(\Delta) - H_{zw} \right\|_{\mathrm{gn}} \le \frac{M \|H_{zp}\|_{\mathrm{gn}} \|H_{qw}\|_{\mathrm{gn}}}{1 - M\|H_{qp}\|_{\mathrm{gn}}} \quad \text{for all } \Delta \in \boldsymbol{\Delta}. \tag{10.56}$$

We will refer to the closed-loop convex specification (10.55) as the small gain condition
(for the perturbation feedback form and gain used). (10.55) and (10.56) are a
precise statement of the idea expressed in section 10.3.1: the closed-loop system will
be robust if the three closed-loop transfer matrices $H_{zp}$, $H_{qw}$, and $H_{qp}$ are "small
enough".
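The bound (10.56) can be checked numerically in the simplest setting, where every transfer matrix is replaced by a single complex number (for example, the value of a SISO transfer function at one frequency) and the gain is the absolute value. The sketch below is only an illustration under that assumption; the function names and the random sampling scheme are hypothetical.

```python
import cmath
import random

def small_gain_bound(h_zp: complex, h_qw: complex, h_qp: complex, M: float) -> float:
    """Right-hand side of the bound (10.56) for scalar gains.

    Raises ValueError when the small gain condition (10.55) does not hold.
    """
    loop = M * abs(h_qp)
    if loop >= 1.0:
        raise ValueError("small gain condition M*|h_qp| < 1 is violated")
    return M * abs(h_zp) * abs(h_qw) / (1.0 - loop)

def perturbation_effect(h_zp: complex, h_qw: complex, h_qp: complex, delta: complex) -> complex:
    """Exact change h_zp * delta * (1 - h_qp*delta)^-1 * h_qw caused by delta."""
    return h_zp * delta / (1.0 - h_qp * delta) * h_qw

def worst_ratio(trials: int = 10000, M: float = 2.0, seed: int = 0) -> float:
    """Largest observed ratio (exact change) / (bound) over random scalar samples."""
    rng = random.Random(seed)

    def rand_complex(radius: float) -> complex:
        # Random point in the disk of the given radius.
        return cmath.rect(radius * rng.random(), 2.0 * cmath.pi * rng.random())

    worst = 0.0
    for _ in range(trials):
        h_zp, h_qw = rand_complex(3.0), rand_complex(3.0)
        h_qp = rand_complex(0.99 / M)   # keeps M*|h_qp| < 1
        delta = rand_complex(M)         # perturbation with |delta| <= M
        bound = small_gain_bound(h_zp, h_qw, h_qp, M)
        if bound > 0.0:
            change = abs(perturbation_effect(h_zp, h_qw, h_qp, delta))
            worst = max(worst, change / bound)
    return worst
```

A ratio that never exceeds one is what (10.56) predicts; a ratio well below one for most samples indicates how much slack the bound can have for a particular perturbation.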
It follows that the closed-loop convex specification on $H$ given by

$$\|H_{qp}\|_{\mathrm{gn}} < 1/M, \tag{10.57}$$

$$\|H_{zp}\|_{\mathrm{gn}} < \infty, \tag{10.58}$$

$$\|H_{qw}\|_{\mathrm{gn}} < \infty, \tag{10.59}$$

$$\|H_{zw}\|_{\mathrm{gn}} < \infty, \tag{10.60}$$

implies that

$$\|H^{\mathrm{pert}}\|_{\mathrm{gn}} < \infty \quad \text{for all } \Delta \in \boldsymbol{\Delta}, \tag{10.61}$$

i.e., the robustness specification formed from the perturbed plant set $\mathcal{P}$ and the
specification $\|H_{zw}\|_{\mathrm{gn}} < \infty$ holds. If the gain $\|\cdot\|_{\mathrm{gn}}$ is finite only for stable transfer
matrices, then the specification (10.57-10.60) implies that $H^{\mathrm{pert}}$ is stable, and thus
the specification (10.57-10.60) is stronger than the specification of robust stability.
In this case, we may think of the specification (10.57-10.60) as a closed-loop convex
specification that guarantees robust stability.

As a more specific example, the RMS gain ($H_\infty$ norm) $\|\cdot\|_\infty$ is finite only for
stable transfer matrices, so the specification $\|H_{qp}\|_\infty < 1/M$, along with stability
of $H_{zp}$, $H_{qw}$, and $H_{zw}$ (which is usually implied by internal stability), guarantees
that the robust stability specification $\mathcal{D}_{\mathrm{rob\_stab}}$ holds for $H$.

The specification (10.57-10.60) can be used to form a convex inner approximation
of a robust generalized stability specification, for various generalizations of
internal stability (see section 7.5), by using other gains. As an example, consider
the $a$-shifted $H_\infty$ norm, which is finite if and only if the poles of its argument have
real parts less than $-a$. If the specification (10.57-10.60) holds for this norm, then
we may conclude that the feedback perturbations cannot cause the poles of the
closed-loop system to have real parts equal to or exceeding $-a$. (We comment that
changing the gain used will generally change $M$, and will give a different specification.)

Since the small gain condition (10.55) depends only on $M$, the largest gain of
the feedback perturbations, it follows that the conclusion (10.56) actually holds for
a set of feedback perturbations that may be larger than $\boldsymbol{\Delta}$:

$$\boldsymbol{\Delta}_{\mathrm{sgt}} = \{\Delta \mid \|\Delta\|_{\mathrm{gn}} \le M\} \supseteq \boldsymbol{\Delta}.$$

By using different perturbation feedback forms of a perturbed plant set, and
different gains that are finite only for stable transfer matrices, the small gain condition
(10.55) can be used to form different convex inner approximations of the robust
stability specification.

10.4.2 An Extrapolated First Order Bound 

It is interesting to compare (10.56) to a corresponding bound that is based on a
first order differential analysis. Since

$$\Delta (I - H_{qp}\Delta)^{-1} \simeq \Delta$$

(recall that $\simeq$ means equals, to first order in $\Delta$), the first order variation in the
closed-loop transfer matrix is given by

$$H^{\mathrm{pert}}(\Delta) - H_{zw} \simeq H_{zp} \Delta H_{qw} \tag{10.62}$$

(c.f. the exact expression given in (10.23)). If we use this first order analysis to
extrapolate the effects of any $\Delta \in \boldsymbol{\Delta}$, we have the approximate bound

$$\|H^{\mathrm{pert}}(\Delta) - H_{zw}\|_{\mathrm{gn}} \lesssim M \|H_{zp}\|_{\mathrm{gn}} \|H_{qw}\|_{\mathrm{gn}} \tag{10.63}$$

(c.f. the small gain bound given in (10.56); $\lesssim$ means that the inequality holds to
first order in $\Delta$).

The small gain bound (10.56) can be interpreted as the extrapolated first order
differential bound (10.63), aggravated (increased) by a term that represents the
"margin" in the small gain condition, i.e.

$$\frac{1}{1 - M\|H_{qp}\|_{\mathrm{gn}}}. \tag{10.64}$$

Of course, the small gain bound (10.56) is correct, whereas extrapolations from the
first order bound (10.63) need not hold.

Continuing this comparison, we can interpret the term (10.64) as representing
the higher order effects of the feedback perturbation $\Delta$, since

$$\frac{1}{1 - M\|H_{qp}\|_{\mathrm{gn}}} = 1 + (M\|H_{qp}\|_{\mathrm{gn}}) + (M\|H_{qp}\|_{\mathrm{gn}})^2 + (M\|H_{qp}\|_{\mathrm{gn}})^3 + \cdots.$$

The bound (10.63), which keeps just the first term in this series, only accounts for
the first order effects.
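The geometric series above can be traced back to a term-by-term expansion of the perturbation itself. The following sketch assumes only that the gain is submultiplicative and that the small gain condition (10.55) holds, so that the series converges; it is offered as a plausible reading of where the series comes from, not as the derivation used in section 5.4.2.

```latex
% Neumann series for the closed perturbation loop, valid when M ||H_qp|| < 1:
\Delta (I - H_{qp}\Delta)^{-1}
  = \Delta + \Delta H_{qp} \Delta + \Delta H_{qp} \Delta H_{qp} \Delta + \cdots
  = \sum_{k=0}^{\infty} \Delta \, (H_{qp}\Delta)^{k}.

% Taking gains term by term, with ||Delta|| <= M:
\left\| \Delta (I - H_{qp}\Delta)^{-1} \right\|_{\mathrm{gn}}
  \le \sum_{k=0}^{\infty} M \left( M \|H_{qp}\|_{\mathrm{gn}} \right)^{k}
  = \frac{M}{1 - M \|H_{qp}\|_{\mathrm{gn}}}.
```

The $k = 0$ term is the first order effect $\Delta$; the terms with $k \ge 1$ correspond to the signal passing $k$ more times around the loop formed by $H_{qp}$ and $\Delta$.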

It is interesting that the transfer matrix $H_{qp}$, which is important in the small gain
based approximation of robust stability, has no effect whatever on the first order
change in $H$, given by (10.62). Thus $H_{qp}$, the feedback "seen" by the feedback
perturbation, has no first order effects, but is key to robust stability.

Conversely, the transfer matrices $H_{zp}$ and $H_{qw}$, which by (10.62) determine the
first order variation of $H$ with respect to changes in $\Delta$, have little to do with robust
stability. Thus, differential sensitivity of $H$, which depends only on $H_{qw}$ and $H_{zp}$,
and robust stability, which depends only on $H_{qp}$, measure different aspects of the
robustness of the closed-loop system.

10.4.3 Nonlinear or Time-Varying Perturbations 

A variation of the small gain method can be used to form an inner approximation
of the robust stability specification, when the feedback perturbations are nonlinear
or time-varying. In this case we define $M$ by

$$M = \sup \left\{ \frac{\|\Delta w\|_{\mathrm{sig}}}{\|w\|_{\mathrm{sig}}} \;\middle|\; \Delta \in \boldsymbol{\Delta},\ \|w\|_{\mathrm{sig}} > 0 \right\}, \tag{10.65}$$

where $\|\cdot\|_{\mathrm{sig}}$ denotes the norm on signals that determines the gain $\|\cdot\|_{\mathrm{gn}}$.

With this definition of $M$ (which coincides with (10.54) when each $\Delta \in \boldsymbol{\Delta}$ is
LTI), the specification (10.57-10.60) implies robust stability. In fact, a close analog
of (10.56) holds: for all $\Delta \in \boldsymbol{\Delta}$ and exogenous inputs $w$, we have

$$\|z^{\mathrm{pert}} - z\|_{\mathrm{sig}} \le \alpha \|w\|_{\mathrm{sig}},$$

where

$$\alpha = \frac{M \|H_{zp}\|_{\mathrm{gn}} \|H_{qw}\|_{\mathrm{gn}}}{1 - M\|H_{qp}\|_{\mathrm{gn}}}.$$

Thus, we have a bound on the perturbations in the regulated variables that can be
induced by the nonlinear or time-varying perturbations.

10.4.4 Examples 

In this section we apply the small gain method to form inner approximations of 
some of the robust stability specifications that we have considered so far. In a 
few cases we will derive different convex inner approximations of the same robust 
stability specification, either by using different perturbation feedback forms for the 
same perturbed plant set, or by using different gains in the small gain theorem. 

Neglected Dynamics: RMS Gain 

We now consider the specification of internal stability along with the robust stability
specification for the perturbation plant set (10.3), i.e., $\mathcal{D}_{\mathrm{rob\_stab}} \cap \mathcal{D}_{\mathrm{stable}}$. We will
use the perturbation feedback form given by (10.24-10.25). The specification that
$H_{zw}$, $H_{qw}$, and $H_{zp}$ are stable is weaker than $\mathcal{D}_{\mathrm{stable}}$, so we will concentrate on
the small gain condition (10.57). No matter which gain we use, $M$ is one, since $\boldsymbol{\Delta}$
contains only one transfer matrix, the $2 \times 2$ identity matrix.

We first use the RMS gain, i.e., the $H_\infty$ norm $\|\cdot\|_\infty$, which is finite only for
stable transfer matrices. The small gain condition is then

$$\|H_{qp}\|_\infty < 1, \tag{10.66}$$

where $H_{qp}$ is a $2 \times 2$ transfer matrix whose entries are formed from $P_{\mathrm{err}}^{(1)}$, $P_{\mathrm{err}}^{(2)}$,
and closed-loop transfer functions such as $T$.

The closed-loop convex specification (10.66) (together with internal stability) is
stronger than the robust stability specification $\mathcal{D}_{\mathrm{rob\_stab}}$ with the plant perturbation
set (10.25) (together with internal stability): if (10.66) is satisfied, then the
corresponding controller also stabilizes $P^{\mathrm{cmplx}}$.

We can interpret the specification (10.66) as limiting the bandwidth of the
closed-loop system. The specification (10.66) can be crudely considered a
frequency-dependent limit on the size of $T$; since $P_{\mathrm{err}}^{(1)}$ and $P_{\mathrm{err}}^{(2)}$ are each highpass filters, this
limit is large for low frequencies, but less than one at high frequencies where $P_{\mathrm{err}}^{(1)}$ and
$P_{\mathrm{err}}^{(2)}$ have magnitudes nearly one, e.g., $\omega \ge 100$. Thus, (10.66) requires that $|T(j\omega)|$
is less than one above about $\omega = 100$; in classical terms, the control bandwidth is
less than 100 rad/sec. Figure 10.16(a) shows the actual region of the complex plane
that the specification (10.66) requires $T(20j)$ to lie in; figure 10.16(b) shows the
same region for $T(200j)$.



Figure 10.16 The specification (10.66) requires $T(j\omega)$ to lie in the shaded
region (a) for $\omega = 20$, and (b) for $\omega = 200$.
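The highpass character of the neglected-dynamics error can be seen by evaluating the error transfer function quoted in section 10.3.2 at the two frequencies used in figure 10.16. The sketch below uses only that formula; the function names are hypothetical, and reading $1/|P_{\mathrm{err}}(j\omega)|$ as a rough size limit on $T$ is an informal interpretation rather than the exact region drawn in the figure.

```python
def p_err(s: complex) -> complex:
    """Error transfer function quoted in section 10.3.2 (neglected dynamics)."""
    x = s / 100.0
    return (-1.25 * x - x * x) / (1.0 + 1.25 * x + x * x)

def magnitude_at(omega: float) -> float:
    """Magnitude of p_err on the imaginary axis, s = j*omega."""
    return abs(p_err(1j * omega))

low = magnitude_at(20.0)     # well below the 100 rad/sec corner: small error
high = magnitude_at(200.0)   # above the corner: error magnitude of order one

print(f"|P_err(20j)|  = {low:.3f}")
print(f"|P_err(200j)| = {high:.3f}")
```

The small value at $\omega = 20$ leaves $T(20j)$ largely unconstrained, while the order-one value at $\omega = 200$ is what forces $|T(200j)|$ to be small.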



Neglected Dynamics: Scaled RMS Gain 

We now apply the small gain method to the same example, using the same perturbation
feedback form, substituting a scaled $H_\infty$ norm for the $H_\infty$ norm used above
(see section 5.3.6). We use the scaling

and the associated scaled gain,

This norm is finite only for stable transfer matrices, so the small gain condition

(10.67)

together with internal stability, is stronger than the robust stability specification
$\mathcal{D}_{\mathrm{rob\_stab}}$ with the plant perturbation set (10.25) (together with internal stability).
Like the specification (10.66), the specification (10.67) can be thought of as limiting
the bandwidth of the closed-loop system.

Figure 10.17(a) shows the actual region of the complex plane that the specification
(10.67) requires $T(20j)$ to lie in; figure 10.17(b) shows the same region
for $T(200j)$; comparing these figures to figures 10.16(a) and 10.16(b), we see that
the two inner approximations of robust stability given by (10.66) and (10.67) are
different: neither is stronger than the other.

Figure 10.17 The specification (10.67) requires $T(j\omega)$ to lie in the shaded
region (a) for $\omega = 20$, and (b) for $\omega = 200$. The boundaries of the regions
from figure 10.16, for the specification (10.66), are shown with a dashed
line.




Gain Margin: Perturbation Feedback Form 1 

We now consider the gain margin specification, using the RMS gain in the small
gain theorem, with the perturbation feedback form given by (10.29-10.30). The
maximum of the RMS gains of the perturbations is

$$M = \max\{\|L - 1\|_\infty,\ \|U - 1\|_\infty\} = \max\{1 - L,\ U - 1\}.$$

For this perturbation feedback form, $H_{qp} = -T$, so the small gain condition is

$$\|T\|_\infty < 1/M = \min\left\{ \frac{1}{1 - L},\ \frac{1}{U - 1} \right\}.$$

For the specific gain margins of +4dB and -3.5dB, i.e., the robustness specification
(10.5), the convex inner approximation is

$$\|T\|_\infty < 1.71. \tag{10.68}$$

The closed-loop convex specification (10.68) is stronger than the gain margin
specification (10.5).
Gain Margin: Perturbation Feedback Form 2

We now use the second perturbation feedback form for the gain margin problem,
given by (10.34-10.35). For this perturbation feedback form, we have

$$M = \max\{1/L - 1,\ 1 - 1/U\},$$

and $H_{qp} = S$, so the small gain condition is

$$\|S\|_\infty < 1/M = \min\left\{ \frac{L}{1 - L},\ \frac{U}{U - 1} \right\}. \tag{10.69}$$

For the specific gain margins of +4dB, -3.5dB, we have

$$\|S\|_\infty < 2.02. \tag{10.70}$$

The closed-loop convex specification (10.70) is a different inner approximation
of the gain margin specification (10.5) than (10.68).
Thus we have

$$\|T\|_\infty < 1.71 \implies \mathcal{D}_{+4,-3.5\mathrm{db\_gm}}, \qquad \|S\|_\infty < 2.02 \implies \mathcal{D}_{+4,-3.5\mathrm{db\_gm}},$$

but neither of the convex specifications on the left-hand sides is stronger than the
other. We can contrast the two specifications by expressing (10.70) as $\|1 - T\|_\infty <
2.02$: see figure 10.18.
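The two numerical limits above follow directly from the decibel gain margins. As a quick check, the sketch below converts +4dB and -3.5dB to the gain factors $U$ and $L$ and evaluates $1/M$ for each perturbation feedback form; the function names are hypothetical.

```python
def db_to_gain(db: float) -> float:
    """Convert a gain in decibels to a linear gain factor."""
    return 10.0 ** (db / 20.0)

def small_gain_limits(upper_db: float, lower_db: float) -> tuple:
    """Return (limit on ||T||_inf from form 1, limit on ||S||_inf from form 2)."""
    U = db_to_gain(upper_db)   # largest gain factor, U > 1
    L = db_to_gain(lower_db)   # smallest gain factor, 0 < L < 1
    M_form1 = max(1.0 - L, U - 1.0)              # perturbations in [L-1, U-1]
    M_form2 = max(1.0 / L - 1.0, 1.0 - 1.0 / U)  # perturbations in [1-1/L, 1-1/U]
    return 1.0 / M_form1, 1.0 / M_form2

t_limit, s_limit = small_gain_limits(4.0, -3.5)
print(f"||T||_inf < {t_limit:.2f}")   # prints 1.71
print(f"||S||_inf < {s_limit:.2f}")   # prints 2.02
```

For these margins the binding perturbation differs between the forms: the upper margin $U$ sets the limit on $T$, while the lower margin $L$ sets the limit on $S$.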



Figure 10.18 The specifications (10.68) and (10.70) require the complex
number $T(j\omega)$ to lie in the indicated circles at each frequency $\omega$.
A Generalized Gain Margin

We consider again the perturbed plant set for the gain margin specification, but
tighten our robustness specification to require that the perturbed closed-loop system
should have a stability degree that exceeds $a > 0$, i.e., the poles of $H^{\mathrm{pert}}$ should have
real parts less than $-a$. To form a convex inner approximation of this robustness
specification, we apply the small gain method to the perturbation feedback form
given by (10.34-10.35) and the $a$-shifted $H_\infty$ norm. For this norm (indeed, for any
gain) we find that

$$M = \max\{1/L - 1,\ 1 - 1/U\},$$

just as for the unshifted $H_\infty$ norm.

The convex inner approximation is then: $H_{zw}$, $H_{zp}$, and $H_{qw}$ (given in (10.36-10.38))
have stability degrees exceeding $a$ (i.e., finite $\|\cdot\|_{\infty,a}$ norms) and

$$\|S\|_{\infty,a} < \min\left\{ \frac{L}{1 - L},\ \frac{U}{U - 1} \right\}.$$

For the specific generalized gain margin of +4dB, -3.5dB, with a minimum stability
degree of 0.2, we have the convex inner approximation

$$\|S\|_{\infty,0.2} < 2.02. \tag{10.71}$$

Pole Variation 

We now consider the perturbed plant set (10.6) from section 10.2.2. We will form
the small gain condition using the perturbation feedback form (10.39-10.40), and
the RMS gain. The maximum RMS gain of the feedback perturbations is 5, so we
have the approximation

$$\left\| \frac{2s}{100 - s^2}\,T + \frac{1}{s + 10} \right\|_\infty < 1/5. \tag{10.72}$$

We can interpret this closed-loop convex specification as follows. The weighting
on $T$ is a bandpass filter, whose peak magnitude is 0.1, at $\omega = 10$. The specification
(10.72), roughly speaking, constrains $T$ for frequencies near $\omega = 10$.

Relative Uncertainty in $P_0$

We will use the perturbation feedback form (10.44) described above, with the RMS
gain. In this case we have $M = 1$, so the small gain condition becomes the weighted
$H_\infty$ norm specification

$$\|W_{\mathrm{rel\_err}} T\|_\infty < 1. \tag{10.73}$$

This requires that the magnitude of $T$ lie below the frequency-dependent limit
$1/|W_{\mathrm{rel\_err}}(j\omega)|$, as shown in figure 10.19.

Figure 10.19 A convex inner approximation of robust stability with the
relative plant uncertainty (10.7) requires that the magnitude of $T$ lie below
the frequency-dependent limit $1/|W_{\mathrm{rel\_err}}(j\omega)|$.



Saturating Actuators 

We consider the perturbed plant set (10.15), with the perturbation feedback form
given by (10.49) and (10.50). We will use the RMS gain or $H_\infty$ norm. From
formula (10.65) we find that $M = 1$, since the RMS value of the dead-zone output
is less than the RMS value of its input, and for large constant signals, the two RMS
values are close. The small gain condition for robust stability is thus

$$\left\| K P_{yu}^{\mathrm{lin}} (I - K P_{yu}^{\mathrm{lin}})^{-1} \right\|_\infty < 1. \tag{10.74}$$

Thus, if the closed-loop convex specification (10.74) is satisfied, then an LTI
controller designed on the basis of the linear model $P^{\mathrm{lin}}$ will at least stabilize the
nonlinear system that includes the actuator saturation.
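The claim that the dead-zone perturbation has RMS gain $M = 1$ can be illustrated on sampled signals. The sketch below assumes that $\mathrm{Sat}$ is the unit saturation (it clips at $\pm 1$), which is an assumption about the definition given earlier in the chapter; the signal choices and function names are hypothetical.

```python
import math

def sat(a: float) -> float:
    """Unit saturation (assumed): passes |a| <= 1 unchanged, clips beyond that."""
    return max(-1.0, min(1.0, a))

def dz(a: float) -> float:
    """Dead zone, Dz(a) = a - Sat(a)."""
    return a - sat(a)

def scaled_dead_zone(q: float, S: float) -> float:
    """Perturbation output S * Dz(q / S) for a threshold S > 0, as in (10.50)."""
    return S * dz(q / S)

def rms(x) -> float:
    """Root-mean-square value of a finite sample sequence."""
    return math.sqrt(sum(v * v for v in x) / len(x))

def rms_gain_ratio(signal, S: float = 1.0) -> float:
    """Ratio of output RMS to input RMS for the dead-zone perturbation."""
    out = [scaled_dead_zone(q, S) for q in signal]
    return rms(out) / rms(signal)

# A sinusoid that saturates part of the time, and a large constant signal.
sine = [3.0 * math.sin(2.0 * math.pi * k / 200) for k in range(2000)]
big_constant = [1000.0] * 100

print(rms_gain_ratio(sine))          # strictly less than 1
print(rms_gain_ratio(big_constant))  # close to 1
```

No finite signal reaches a ratio of exactly one, but large constant inputs approach it, which is why the supremum in (10.65) equals one.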

10.5 Small Gain Method for Robust Performance 

10.5.1 A Convex Inner Approximation 

A variation of the small gain method can be used to form convex inner approximations
of robustness specifications that involve a gain bound such as

$$\|H_{\tilde{z}\tilde{w}}\|_{\mathrm{gn}} \le \alpha,$$

where $H_{\tilde{z}\tilde{w}}$ is some entry or submatrix of $H$ (c.f. robust stability, which involves
the gain bound $\|H_{\tilde{z}\tilde{w}}\|_\infty < \infty$).

Throughout this section, we will consider the robustness specification that is
formed from the perturbed plant set $\mathcal{P}$ and the RMS gain bound specification

$$\|H_{\tilde{z}\tilde{w}}\|_\infty \le 1. \tag{10.75}$$

We will refer to this robust performance specification as $\mathcal{D}_{\mathrm{rob\_perf}}$. We will also
assume that the perturbed plant set $\mathcal{P}$ is described by a perturbation feedback
form for which the maximum RMS gain of the feedback perturbations is one, i.e.,
$M = 1$ in (10.54).

The inner approximation of $\mathcal{D}_{\mathrm{rob\_perf}}$ is

$$\left\| \begin{bmatrix} H_{\tilde{z}\tilde{w}} & H_{\tilde{z}p} \\ H_{q\tilde{w}} & H_{qp} \end{bmatrix} \right\|_\infty \le 1. \tag{10.76}$$

Like the inner approximation (10.57-10.60) of the robust stability specification
$\mathcal{D}_{\mathrm{rob\_stab}}$, we can interpret (10.76) as limiting the size of $H_{\tilde{z}p}$, $H_{q\tilde{w}}$, and $H_{qp}$.

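Let us show that (10.76) implies that the specification (10.75) holds robustly,
i.e.,

$$\left\| H_{\tilde{z}\tilde{w}} + H_{\tilde{z}p} \Delta (I - H_{qp}\Delta)^{-1} H_{q\tilde{w}} \right\|_\infty \le 1 \quad \text{for all } \Delta \in \boldsymbol{\Delta}. \tag{10.77}$$

Assume that (10.76) holds, so that for any signals $\tilde{w}$ and $p$ we have

$$\left\| \begin{bmatrix} \tilde{z} \\ q \end{bmatrix} \right\|_{\mathrm{rms}} \le \left\| \begin{bmatrix} \tilde{w} \\ p \end{bmatrix} \right\|_{\mathrm{rms}}, \tag{10.78}$$

where

$$\begin{bmatrix} \tilde{z} \\ q \end{bmatrix} = \begin{bmatrix} H_{\tilde{z}\tilde{w}} & H_{\tilde{z}p} \\ H_{q\tilde{w}} & H_{qp} \end{bmatrix} \begin{bmatrix} \tilde{w} \\ p \end{bmatrix}.$$

The inequality (10.78) can be rewritten

$$\|\tilde{z}\|_{\mathrm{rms}}^2 + \|q\|_{\mathrm{rms}}^2 \le \|\tilde{w}\|_{\mathrm{rms}}^2 + \|p\|_{\mathrm{rms}}^2. \tag{10.79}$$

Now assume that $p = \Delta q$, where $\Delta \in \boldsymbol{\Delta}$, so that these signals correspond to
closed-loop behavior of the perturbed system, i.e.,

$$\tilde{z} = \left( H_{\tilde{z}\tilde{w}} + H_{\tilde{z}p} \Delta (I - H_{qp}\Delta)^{-1} H_{q\tilde{w}} \right) \tilde{w}. \tag{10.80}$$

Since $\|\Delta\|_\infty \le 1$, we have

$$\|p\|_{\mathrm{rms}} \le \|q\|_{\mathrm{rms}}. \tag{10.81}$$

From (10.79-10.81) we conclude that

$$\|\tilde{z}\|_{\mathrm{rms}} = \left\| \left( H_{\tilde{z}\tilde{w}} + H_{\tilde{z}p} \Delta (I - H_{qp}\Delta)^{-1} H_{q\tilde{w}} \right) \tilde{w} \right\|_{\mathrm{rms}} \le \|\tilde{w}\|_{\mathrm{rms}}.$$

Since this holds for any $\tilde{w}$, (10.77) follows.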
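The argument above can be exercised numerically in its simplest instance, where each of the four blocks is a single complex number (for example, SISO transfer functions evaluated at one frequency) and the gain of the block matrix is its largest singular value. The sketch below is an illustration under that assumption only; the names `a`, `b`, `c`, `d` stand for the four blocks and are hypothetical.

```python
import cmath
import math
import random

def sigma_max_2x2(a: complex, b: complex, c: complex, d: complex) -> float:
    """Largest singular value of the complex matrix [[a, b], [c, d]]."""
    frob2 = abs(a) ** 2 + abs(b) ** 2 + abs(c) ** 2 + abs(d) ** 2
    det2 = abs(a * d - b * c) ** 2
    # Eigenvalues of A^H A have sum frob2 and product det2.
    disc = max(frob2 * frob2 - 4.0 * det2, 0.0)
    return math.sqrt((frob2 + math.sqrt(disc)) / 2.0)

def perturbed_gain(a: complex, b: complex, c: complex, d: complex, delta: complex) -> complex:
    """Scalar analog of H_zw + H_zp * delta * (1 - H_qp*delta)^-1 * H_qw."""
    return a + b * delta * c / (1.0 - d * delta)

def worst_perturbed_gain(trials: int = 5000, seed: int = 1) -> float:
    """Largest |perturbed gain| seen when the block matrix has gain below one."""
    rng = random.Random(seed)

    def rand_complex(radius: float = 1.0) -> complex:
        return cmath.rect(radius * rng.random(), 2.0 * cmath.pi * rng.random())

    worst = 0.0
    for _ in range(trials):
        a, b, c, d = (rand_complex() for _ in range(4))
        scale = 0.999 / sigma_max_2x2(a, b, c, d)   # enforce the analog of (10.76)
        a, b, c, d = a * scale, b * scale, c * scale, d * scale
        delta = rand_complex(1.0)                    # perturbation with |delta| <= 1
        worst = max(worst, abs(perturbed_gain(a, b, c, d, delta)))
    return worst
```

In every sample the perturbed gain stays at or below one, as the proof predicts; the random search gives no indication of how tight the condition is for a particular design.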

Doyle has interpreted the specification (10.76) as a small gain based robust
stability condition (10.55) for a perturbed plant set that includes an unknown-but-bounded
transfer matrix connected from $\tilde{z}$ to $\tilde{w}$. This "performance loop" is shown
in figure 10.20.

If the condition (10.76) holds, then the closed-loop system in figure 10.20 is
robustly stable for all $\tilde{\Delta}$ with $\|\tilde{\Delta}\|_\infty < 1$. In particular, the closed-loop system in
figure 10.20 will be robustly stable for all $\tilde{\Delta}$ of the form

$$\tilde{\Delta} = \begin{bmatrix} \Delta & 0 \\ 0 & \Delta_{\mathrm{perf}} \end{bmatrix}$$

with $\|\Delta\|_\infty < 1$ and $\|\Delta_{\mathrm{perf}}\|_\infty < 1$. This is equivalent to the specification (10.75)
holding robustly for all $\Delta$ with $\|\Delta\|_\infty < 1$.

Figure 10.20 Doyle's performance loop connects a feedback $\Delta_{\mathrm{perf}}$ from
the critical regulated variable $\tilde{z}$ back to the critical exogenous input $\tilde{w}$. If
the resulting system is robustly stable (with $\|\Delta_{\mathrm{perf}}\|_\infty < 1$ and $\|\Delta\|_\infty < 1$),
then the original system robustly satisfies the performance specification
$\|H_{\tilde{z}\tilde{w}}\|_\infty \le 1$.

10.5.2 An Example 

We consider the plant perturbation set (10.10), i.e., frequency-dependent relative
uncertainty in the transfer function $P_0$, and the robustness specification (10.11)
which limits the RMS gain from the reference input to the actuator signal to be less
than or equal to 75, for all possible perturbations.

Applying the method above yields the convex specification

$$\left\| \begin{bmatrix} T/(75P_0) & -W_{\mathrm{rel\_err}}T \\ T/(75P_0) & -W_{\mathrm{rel\_err}}T \end{bmatrix} \right\|_\infty \le 1, \tag{10.82}$$

which guarantees that the robust performance specification holds.

The inner approximation (10.82) can be simplified by factoring $T$ out of the
matrix, and assuming $T$ is stable, in which case it is equivalent to

$$|T(j\omega)|\, \sigma_{\max}\!\left( \begin{bmatrix} 1/(75P_0(j\omega)) & -W_{\mathrm{rel\_err}}(j\omega) \\ 1/(75P_0(j\omega)) & -W_{\mathrm{rel\_err}}(j\omega) \end{bmatrix} \right)
= |T(j\omega)| \sqrt{2\left( |W_{\mathrm{rel\_err}}(j\omega)|^2 + |1/(75P_0(j\omega))|^2 \right)} \le 1 \quad \text{for } \omega \in \mathbf{R}.$$

Hence, (10.82) is equivalent to the specification that $T$ be stable and satisfy the
frequency-dependent magnitude limit

$$|T(j\omega)| \le \frac{1}{\sqrt{2\left( |W_{\mathrm{rel\_err}}(j\omega)|^2 + |1/(75P_0(j\omega))|^2 \right)}} \quad \text{for } \omega \in \mathbf{R}, \tag{10.83}$$

which is plotted in figure 10.21. The reader should compare the specification (10.83)
to (10.73), which guarantees robust stability, and is also plotted in figure 10.21.
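The three magnitude limits on $T$ discussed in this chapter can be compared on a frequency grid: the robust stability limit $1/|W_{\mathrm{rel\_err}}|$ from (10.73), the small gain robust performance limit (10.83), and the limit implied by the exact condition quoted in the Notes and References, $|W_{\mathrm{rel\_err}}T| + |(1/75)T/P_0| \le 1$. The sketch below is illustrative only: the weight `w_rel_err` is hypothetical, and the plant magnitude assumes the standard example plant $P_0(s) = (1/s^2)(10 - s)/(10 + s)$, whose all-pass factor has unit magnitude on the imaginary axis.

```python
import math

def p0_mag(omega: float) -> float:
    """|P0(j*omega)| for the assumed plant (1/s^2)(10 - s)/(10 + s)."""
    return 1.0 / omega ** 2

def w_rel_err_mag(omega: float) -> float:
    """Magnitude of a hypothetical relative-error weight that grows with frequency."""
    s = 1j * omega
    return abs(0.2 * (1.0 + s / 5.0) / (1.0 + s / 200.0))

def limit_robust_stability(omega: float) -> float:
    """Limit on |T| from (10.73)."""
    return 1.0 / w_rel_err_mag(omega)

def limit_small_gain_performance(omega: float) -> float:
    """Limit on |T| from (10.83)."""
    w = w_rel_err_mag(omega)
    x = 1.0 / (75.0 * p0_mag(omega))
    return 1.0 / math.sqrt(2.0 * (w * w + x * x))

def limit_exact_performance(omega: float) -> float:
    """Limit on |T| implied by |W T| + |(1/75) T / P0| <= 1."""
    return 1.0 / (w_rel_err_mag(omega) + 1.0 / (75.0 * p0_mag(omega)))

# Logarithmically spaced grid from 0.1 to 1000 rad/sec.
grid = [10.0 ** (-1.0 + 4.0 * k / 199) for k in range(200)]
rows = [(w, limit_robust_stability(w), limit_exact_performance(w),
         limit_small_gain_performance(w)) for w in grid]
```

Because $\sqrt{2(a^2 + b^2)}$ lies between $a + b$ and $\sqrt{2}(a + b)$ for nonnegative $a$ and $b$, the small gain limit is never looser than the exact one and is tighter by at most a factor of $\sqrt{2}$ at any frequency.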
Figure 10.21 If the closed-loop transfer function $T$ is stable and its magnitude
lies below the limit shown, then the relative uncertainty in $P_0$ cannot
cause the perturbed closed-loop transfer function from reference input to
actuator signal to have RMS gain exceeding 75. The dotted limit shows the
previously determined bound that guarantees closed-loop stability despite
the relative uncertainty in $P_0$ (see figure 10.19). The dashed line is the
bound imposed by the performance specification $\|T/P_0\|_\infty \le 75$. For this
example, the specification of robust stability and nominal performance is
not much looser than the small gain based inner approximation of robust
performance.
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Notes and References 

Comparison with Differential Sensitivity 

In [Lun89, p48-49], Lunze states:

Sensitivity is a local property that describes how strongly the system perfor- 
mance is affected by very small perturbations around a given nominal point a. 
No information about the amount of perturbations is used. However, as many 
properties depend continuously on the parameter vector a, extrapolations can 
be made from very small to finite deviations. Therefore, sensitivity analysis 
yields guidelines for the attenuation of severe parameter perturbation and, 
hence, the achievement of robustness. But sensitivity analysis alone cannot 
ensure this robustness because the range of validity of the results is not known. 

(See also table 3.1 in [Lun89, p49].) 

The distinction between parametrized and other perturbed plant sets is not as clear as it 
might seem. In [Boy86], it is observed that robust stability specifications with unknown- 
but-bounded transfer function perturbations can be recast as robust stability specifications 
with parametrized plant perturbation sets. (There does not appear to be any advantage 
in doing so.) 

Relation to Classical Control Ideas 

Many of the small gain based approximations of robustness specifications that we have
seen can be interpreted as limiting the magnitude of $T$ or $S$. The idea that a closed-loop
system with "large" $T$ or $S$ can be very sensitive to changes in $P_0$ is well-known in classical
control; the specification $\|T\|_\infty \le M$ is called an $M$-circle constraint. Horowitz [Hor63,
p148] states

... it is not necessarily a useful practical system, if the locus [of L] passes very 
close to the —1 point. In the first place, a slight change of gain or time constant 
may sufficiently shift the locus so as to lead to an unstable system. In the 
second place, the closed-loop system response has 1 + L for its denominator. 
At those frequencies for which L is close to —1, 1 + L is close to zero, leading 
to large peaking in the system frequency response. 

A classical interpretation of the small gain specification $\|S\|_\infty \le \alpha$ is that the Nyquist
plot maintains a distance of at least $1/\alpha$ from the critical point $-1$ (see [BBN90]).
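The link between the peak of $|S|$ and the distance of the Nyquist plot from $-1$ follows from $S = 1/(1 + L)$, and can be seen numerically on a frequency grid. The loop transfer function below is hypothetical, chosen only so that the closed loop is stable; a grid can only estimate the supremum from below.

```python
def loop_gain(s: complex) -> complex:
    """Hypothetical loop transfer function L(s) = 2 / (s (s + 1))."""
    return 2.0 / (s * (s + 1.0))

def frequency_grid(lo: float = 1e-2, hi: float = 1e2, n: int = 2000) -> list:
    """Logarithmically spaced frequencies between lo and hi (rad/sec)."""
    ratio = (hi / lo) ** (1.0 / (n - 1))
    return [lo * ratio ** k for k in range(n)]

def nyquist_distance_and_peak(grid) -> tuple:
    """Return (min |1 + L(jw)|, max |S(jw)|) over the grid, with S = 1/(1 + L)."""
    distances = [abs(1.0 + loop_gain(1j * w)) for w in grid]
    sensitivities = [abs(1.0 / (1.0 + loop_gain(1j * w))) for w in grid]
    return min(distances), max(sensitivities)

distance, peak = nyquist_distance_and_peak(frequency_grid())
print(f"closest approach to -1: {distance:.3f}")
print(f"peak of |S|:            {peak:.3f}")
```

The two printed numbers are reciprocals of each other, so bounding the peak of $|S|$ by $\alpha$ is the same as keeping the Nyquist plot at least $1/\alpha$ away from $-1$.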

For a discussion of singular perturbations of control systems, see Kokotovic, Khalil, and 
O'Reilly [KKO86]. 

Small Gain Methods 

Small gain methods are discussed in the books by Desoer and Vidyasagar [DV75] and
Vidyasagar [Vid78]. In most discussions the method is used to establish stability despite
nonlinear and time-varying perturbations.

The small gain theorem for linear A is a basic result of Functional Analysis, often attributed 
to Banach; see, e.g., Kantorovich and Akilov [KA82, p154]. Its use in the analysis of 
feedback systems was introduced by Zames [Zam66a]. 
Many papers that discuss robustness specifications and small gain approaches are reprinted
in the volume edited by Dorato [Dor87]; see also the recent books by Lunze [Lun89, ch8]
and Maciejowski [Mac89, §3.10].

Conservatism of Small Gain Methods 

Since the small gain method yields inner approximations of robustness specifications, it is
natural to ask how "conservative" these approximations can be. Doyle, Wall, Stein, Chen,
and Desoer [DWS82, DS81, CD82b] observed that for the special case when

$$\boldsymbol{\Delta} = \{\Delta \mid \|\Delta\|_\infty \le M\},$$

the small gain condition for robust stability, i.e.,

$$\|H_{qp}\|_\infty < 1/M,$$

is exactly equivalent to the robust stability specification, and not just an approximation.
In such cases, therefore, the specification of robust stability is closed-loop convex. Thus
in our example of robust stability despite relative uncertainty in $P_0$, the small gain condition
(10.73) is the same as the robust stability specification.

In other cases the approximation can be arbitrarily conservative; see for example the papers 
by Doyle [Doy82, Doy78] and Safonov and Doyle [SD84]. These papers suggest various 
ways this conservatism can be reduced, for example, by choosing an optimally-scaled norm 
for the small gain theorem. (We saw in our unmodeled plant dynamics example that scaling 
can affect the inner approximation produced by the small gain method.) But limits on the 
optimally-scaled gain are not closed-loop convex. 

In these papers, Doyle introduces the structured singular value; if the structured singular 
value is substituted for the norm-bound in the small gain theorem, then there is no con- 
servatism for robust stability problems with certain types of feedback perturbation sets. 
Specifications that limit the structured singular value are, however, not closed-loop convex. 

Small Gain Theorem and Lyapunov Stability 

Many of the specifications that we have encountered in this chapter have the form of a
(possibly weighted) $H_\infty$ norm-bound on $H_{qp}$. If such a specification is satisfied, then
we can compute the positive definite solution $X$ of the ARE (5.42) as we described in
section 5.6.3. This matrix provides a Lyapunov function, $V(z) = z^T X z$, that proves that
robust stability holds. See [BBK89] and [Wil73].

Circle Theorem 

Several of the small gain robustness specifications that we encountered can be interpreted 
as instances of the circle criterion, developed by Zames [Zam66b], Sandberg [San64], and 
Narendra and Goldwyn [NG64]. Multivariate versions were developed by Safonov and 
Athans [Saf80, SA81] and others. Part III of the collection edited by MacFarlane [Mac79] 
contains reprints of many of these original articles. 

Boyd, Barratt, and Norman [BBN90] show that a general circle criterion specification is 
closed-loop convex. 
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About the Examples 

The gain margin specification has been completely analyzed by Tannenbaum in [Tan80]
and [Tan82]; his analysis shows that it is not closed-loop convex; indeed, the set of transfer
matrices that satisfy a gain margin specification need not be connected.

The convex inner approximation (10.71) of the generalized gain margin, which limits the
shifted $H_\infty$ norm of $S$, is a special form of a generalized circle theorem (Moore [Moo68]).

The robust performance specification example turns out to be closed-loop convex, since it
can be shown to be equivalent to

$$|W_{\mathrm{rel\_err}}(j\omega)T(j\omega)| + |(1/75)T(j\omega)/P_0(j\omega)| \le 1 \quad \text{for all } \omega.$$

In fact, for the 1-DOF control system, most robust performance specifications that are
expressed in terms of weighted $H_\infty$ norm-bounds also turn out to be closed-loop convex.
Another example is described in Francis [Fra88], and discussed in Boyd, Barratt and
Norman [BBN90].

Robust Performance Method 

This method was introduced by Doyle in [Doy82, Doy78], and is discussed in
Maciejowski [Mac89, §3.12].

Describing Function Method 

The describing function method is described in [GV68] and [Vid78, ch4]. Several
modifications can make the describing function method nonheuristic; see, e.g., Mees and
Bergen [MB75].

An Extension of the Saturating Actuators Example 

We saw that it may be possible to design an LTI controller for a plant that is linear except 
for saturating actuators, in such a way that we can guarantee that the resulting nonlinear 
closed-loop system is stable, by requiring the specification (10.74) to be met. In this section 
we briefly describe an extension of this idea that has been very useful in practice, and is 
interesting because the method effectively synthesizes a nonlinear controller. 

The control system architecture is shown in figure 10.22. The nonlinear controller $K_{\mathrm{nonlin}}$ 
consists of a two-input, one-output LTI system, with its output saturated and fed back to 
its first input; the output is identical to the signal that drives $P^{\mathrm{lin}}$. 

We redraw this control system as shown in figure 10.23, and consider the dead-zone 
nonlinearity as a perturbation. We augment our design specifications with the small gain 
condition $\|H_{qp}\|_\infty < 1$; any resulting design is guaranteed to yield a stable nonlinear 
closed-loop system. (Indeed, by our comments above, we can produce a Lyapunov function 
that proves stability of the closed-loop system.) 

This scheme is discussed in, for example, Åström and Wittenmark [AW90], and Morari 
and Zafiriou [MZ89, §3.2.3]. 

Figure 10.22 The nonlinear controller $K_{\mathrm{nonlin}}$ consists of a two-input, one-output 
LTI system, with its output saturated and fed back to its first input 
(this is the signal that drives $P^{\mathrm{lin}}$). 

Figure 10.23 The saturators in figure 10.22 are treated as a dead-zone 
nonlinearity perturbation to a linear system. 

Chapter 11 

A Pictorial Example 



The sets of transfer matrices that satisfy specifications are generally infinite- 
dimensional. In this chapter we consider our standard example described in 
section 2.4 with an additional two-dimensional affine specification. This allows us 
to visualize a two-dimensional "slice" through the various specifications we have 
encountered. The reader can directly see, for this example, that specifications we 
have claimed are convex are indeed convex. 

Recall from section 2.4 that $H^{(a)}$, $H^{(b)}$, and $H^{(c)}$ are the closed-loop transfer 
matrices resulting from the three controllers $K^{(a)}$, $K^{(b)}$, and $K^{(c)}$ given there. The 
closed-loop affine specification 

$$ \mathcal{H}_{\mathrm{slice}} = \left\{ H \;\middle|\; H = \alpha H^{(a)} + \beta H^{(b)} + (1-\alpha-\beta) H^{(c)} \text{ for some } \alpha, \beta \in \mathbf{R} \right\} $$

requires $H$ to lie on the plane passing through these three transfer matrices. The 
specification $\mathcal{H}_{\mathrm{slice}}$ has no practical use, but we will use it throughout this chapter 
to allow us to plot two-dimensional "slices" through other (useful) specifications. 

Figure 11.1 shows the subset $-1 < \alpha < 2$, $-1 < \beta < 2$ of $\mathcal{H}_{\mathrm{slice}}$. Most plots that 
we will see in this chapter use this range. Each point in figure 11.1 corresponds to 
a closed-loop transfer matrix; for example, $H^{(a)}$ corresponds to the point $\alpha = 1$, 
$\beta = 0$, $H^{(b)}$ corresponds to the point $\alpha = 0$, $\beta = 1$, and $H^{(c)}$ corresponds to the 
point $\alpha = 0$, $\beta = 0$. Also shown in figure 11.1 are the points 

$$ 0.6H^{(a)} + 0.3H^{(b)} + 0.1H^{(c)} \quad \text{and} \quad -0.2H^{(a)} - 0.6H^{(b)} + 1.8H^{(c)}. $$

Each point in figure 11.1 also corresponds to a particular controller, although we 
will not usually be concerned with the controller itself. The controller that realizes 
the closed-loop transfer matrix 

$$ \alpha H^{(a)} + \beta H^{(b)} + (1-\alpha-\beta) H^{(c)} $$

can be computed by two applications of equation (7.10) from chapter 7. 
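
As a minimal sketch of how a point $(\alpha, \beta)$ selects one member of the slice, the Python fragment below forms the affine combination of three closed-loop responses. The three scalar transfer functions are hypothetical stand-ins chosen for illustration (they are not the $H^{(a)}$, $H^{(b)}$, $H^{(c)}$ of section 2.4), and `slice_point` is an illustrative name, not something defined in the text.

```python
def slice_point(alpha, beta, Ha, Hb, Hc):
    """Return s -> alpha*Ha(s) + beta*Hb(s) + (1 - alpha - beta)*Hc(s)."""
    gamma = 1.0 - alpha - beta  # the three weights always sum to one
    return lambda s: alpha * Ha(s) + beta * Hb(s) + gamma * Hc(s)

# Hypothetical stand-ins, each with unit DC gain (assumptions for this sketch).
Ha = lambda s: 1.0 / (s + 1.0)
Hb = lambda s: 2.0 / (s + 2.0)
Hc = lambda s: 3.0 / (s + 3.0)

s0 = 0.5j  # a point on the imaginary axis, omega = 0.5
corner_a = slice_point(1.0, 0.0, Ha, Hb, Hc)(s0)  # (alpha, beta) = (1, 0) recovers Ha
corner_c = slice_point(0.0, 0.0, Ha, Hb, Hc)(s0)  # (alpha, beta) = (0, 0) recovers Hc
interior = slice_point(0.6, 0.3, Ha, Hb, Hc)      # weights 0.6, 0.3, 0.1
dc_gain = interior(0.0)                            # affine weights preserve the common DC gain
```

Because the weights sum to one, any property shared by all three responses and preserved under affine combination (here, unit DC gain) holds at every point of the slice.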
Figure 11.1 Each point $(\alpha, \beta)$ corresponds to a closed-loop transfer matrix 
$H$ that lies in the plane through $H^{(a)}$, $H^{(b)}$, and $H^{(c)}$. 

Many of the figures in this chapter show level curves of functionals on $\mathcal{H}_{\mathrm{slice}}$, 
i.e., 

$$ \left\{ [\alpha\ \ \beta]^T \;\middle|\; \phi\left(\alpha H^{(a)} + \beta H^{(b)} + (1-\alpha-\beta) H^{(c)}\right) = \gamma \right\}. $$

For quasiconvex functionals this set might not be a connected curve, so we define 
the level curve to be the boundary of the convex set 

$$ \left\{ [\alpha\ \ \beta]^T \;\middle|\; \phi\left(\alpha H^{(a)} + \beta H^{(b)} + (1-\alpha-\beta) H^{(c)}\right) \le \gamma \right\}. $$

In most cases these two definitions agree. 



11.1 I/O Specifications 

11.1.1 A Settling Time Limit 

The step responses from the reference input $r$ to $y_p$ for the three closed-loop systems 
are shown in figure 11.2. Figure 11.3 shows the level curves of the reference $r$ to $y_p$ 
settling-time functional $\phi_{\mathrm{settle}}$, i.e., 

$$ \phi_{\mathrm{settle}}\left(\alpha H_{13}^{(a)} + \beta H_{13}^{(b)} + (1-\alpha-\beta) H_{13}^{(c)}\right). \tag{11.1} $$

Recall from section 8.1.1 that $\phi_{\mathrm{settle}}$, and therefore (11.1), are quasiconvex. From 
figure 11.3 it can be seen that the level curves bound convex subsets. 

Figure 11.2 The step responses from the reference input, $r$, to plant output, 
$y_p$, for the closed-loop transfer matrices $H^{(a)}$, $H^{(b)}$, and $H^{(c)}$. 

Figure 11.3 Level curves of the step response settling time, from the 
reference $r$ to $y_p$, given by (11.1). 

11.1.2 Some Worst Case Tracking Error Specifications 

We now consider the tracking error. For our example, the tracking error transfer 
function is given by 

$$ H_{13} - 1 $$

(we have not set up our example with the tracking error explicitly included in 
the regulated variables). Figure 11.4 shows the level curves of the weighted peak 
tracking error, i.e., 

$$ \varphi_{\mathrm{pk\_trk}}(\alpha, \beta) = \left\| W\left(\alpha H_{13}^{(a)} + \beta H_{13}^{(b)} + (1-\alpha-\beta) H_{13}^{(c)} - 1\right) \right\|_{\mathrm{pk\_gn}}, \tag{11.2} $$

where the weight is 

$$ W(s) = \frac{1}{2s+1}. $$

(We use the symbol $\varphi$ to denote the restriction of a functional to a finite-dimensional 
domain.) Recall from section 8.1.2 that the weighted peak tracking error is a convex 
functional of $H$, and therefore $\varphi_{\mathrm{pk\_trk}}$ is a convex function of $\alpha$ and $\beta$. From 
figure 11.4 it can be seen that the level curves bound convex subsets. 

Figure 11.5 shows the level curves of the peak tracking error, for reference inputs 
bounded and slew-rate limited by 1, i.e., 

$$ \left\| \alpha H_{13}^{(a)} + \beta H_{13}^{(b)} + (1-\alpha-\beta) H_{13}^{(c)} - 1 \right\|_{\mathrm{wc}}. \tag{11.3} $$

In section 6.3.2 we showed that a function of the form (11.3) is convex; as expected, 
the level curves in figure 11.5 bound convex subsets of $\mathcal{H}_{\mathrm{slice}}$. 
In section 5.5.2 we showed that, for any transfer function $H$, 

$$ \|WH\|_{\mathrm{pk\_gn}} \le \|H\|_{\mathrm{wc}} \le 3\|WH\|_{\mathrm{pk\_gn}}. $$

The reader should compare the level curves in figures 11.4 and 11.5 with this relation 
in mind. 

Figure 11.4 Level curves of the weighted peak tracking error, given 
by (11.2). 

Figure 11.5 Level curves of the peak tracking error, for reference inputs 
bounded and slew-rate limited by 1, given by (11.3). 

11.2 Regulation 

11.2.1 Asymptotic Rejection of Constant Disturbances 

We first consider the effect on $y_p$ of a constant disturbance applied to $n_{\mathrm{proc}}$. 
Figure 11.6 shows the subset of $\mathcal{H}_{\mathrm{slice}}$ where such a disturbance asymptotically has no 
effect on $y_p$, i.e., where the affine function 

$$ \alpha H_{11}^{(a)}(0) + \beta H_{11}^{(b)}(0) + (1-\alpha-\beta) H_{11}^{(c)}(0) \tag{11.4} $$

vanishes. 

Figure 11.6 Asymptotic rejection of constant actuator-referred disturbances 
on $y_p$. 



11.2.2 Rejection of a Particular Disturbance 

Suppose now that an actuator-referred disturbance is the waveform $d_{\mathrm{part}}(t)$ shown 
in figure 11.7. Figure 11.8 shows the level curves of the peak output $y_p$ due to the 
actuator-referred disturbance $d_{\mathrm{part}}$, i.e., 

$$ \left\| \left(\alpha H_{11}^{(a)} + \beta H_{11}^{(b)} + (1-\alpha-\beta) H_{11}^{(c)}\right) d_{\mathrm{part}} \right\|_\infty, \tag{11.5} $$

which is a convex function on $\mathbf{R}^2$. 

Figure 11.7 A particular actuator-referred process disturbance signal, 
$d_{\mathrm{part}}(t)$. 

Figure 11.8 Level curves of the peak of $y_p$ due to the particular actuator-referred 
disturbance $d_{\mathrm{part}}(t)$ shown in figure 11.7, given by (11.5). 

11.2.3 RMS Regulation 

Suppose that $n_{\mathrm{proc}}$ and $n_{\mathrm{sensor}}$ are independent, zero-mean stochastic processes with 
power spectral densities 

$$ S_{\mathrm{proc}}(\omega) = W_{\mathrm{proc}}^2, \qquad S_{\mathrm{sensor}}(\omega) = W_{\mathrm{sensor}}^2, $$

where 

$$ W_{\mathrm{proc}} = 0.04, \qquad W_{\mathrm{sensor}} = 0.01 $$

(i.e., scaled white noises). Figure 11.9 shows the level curves of the RMS value of 
$y_p$ due to these noises, i.e., the level curves of the function 

$$ \varphi_{\mathrm{rms\_yp}}(\alpha, \beta) = \phi_{\mathrm{rms\_yp}}\left(\alpha H^{(a)} + \beta H^{(b)} + (1-\alpha-\beta) H^{(c)}\right), \tag{11.6} $$

where 

$$ \phi_{\mathrm{rms\_yp}}(H) \triangleq \left( \|H_{11} W_{\mathrm{proc}}\|_2^2 + \|H_{12} W_{\mathrm{sensor}}\|_2^2 \right)^{1/2}. \tag{11.7} $$

Recall from section 8.2.2 that the RMS response to independent stochastic inputs 
with known power spectral densities is a convex functional of $H$; therefore $\phi_{\mathrm{rms\_yp}}$ 
is a convex function of $H$, and $\varphi_{\mathrm{rms\_yp}}$ is a convex function of $\alpha$ and $\beta$. 

11.3 Actuator Effort 

11.3.1 A Particular Disturbance 

We consider again the particular actuator-referred disturbance $d_{\mathrm{part}}(t)$ shown in 
figure 11.7. Figure 11.10 shows the level curves of the peak actuator signal $u$ due to the 
actuator-referred disturbance $d_{\mathrm{part}}$, i.e., 

$$ \left\| \left(\alpha H_{21}^{(a)} + \beta H_{21}^{(b)} + (1-\alpha-\beta) H_{21}^{(c)}\right) d_{\mathrm{part}} \right\|_\infty, \tag{11.8} $$

which is a convex function on $\mathbf{R}^2$. 

11.3.2 RMS Limit 

Figure 11.11 shows the level curves of the RMS value of $u$ due to the noises described 
in section 11.2.3, i.e., the level curves of the function 

$$ \phi_{\mathrm{rms\_u}}\left(\alpha H^{(a)} + \beta H^{(b)} + (1-\alpha-\beta) H^{(c)}\right), \tag{11.9} $$

where 

$$ \phi_{\mathrm{rms\_u}}(H) \triangleq \left( \|H_{21} W_{\mathrm{proc}}\|_2^2 + \|H_{22} W_{\mathrm{sensor}}\|_2^2 \right)^{1/2}. \tag{11.10} $$

Recall from section 8.2.2 that the RMS response to independent stochastic inputs 
with known power spectral densities is a convex functional of $H$; therefore $\phi_{\mathrm{rms\_u}}$ 
is a convex function of $H$, and (11.9) is a convex function of $\alpha$ and $\beta$. 

Figure 11.9 Level curves of the RMS value of $y_p$, with sensor and actuator 
noises, given by (11.6). 

Figure 11.10 Level curves of the peak actuator signal $u$, due to the particular 
actuator-referred disturbance $d_{\mathrm{part}}(t)$ shown in figure 11.7, given 
by (11.8). 

Figure 11.11 Level curves of the RMS value of the actuator signal $u$, with 
sensor and actuator noises, given by (11.9). 



11.3.3 RMS Gain Limit 

Figure 11.12 shows the level curves of the worst case RMS actuator signal $u$ for any 
reference input $r$ with RMS value bounded by 1, i.e., 

$$ \left\| \alpha H_{23}^{(a)} + \beta H_{23}^{(b)} + (1-\alpha-\beta) H_{23}^{(c)} \right\|_\infty. \tag{11.11} $$

Figure 11.12 Level curves of the worst case RMS actuator signal $u$ for 
any reference input $r$ with RMS value bounded by 1, given by (11.11). 

11.4 Sensitivity Specifications 

11.4.1 A Log Sensitivity Specification 

We consider the plant perturbation 

$$ \delta P_0^{\mathrm{std}}(s) = \gamma P_0^{\mathrm{std}}(s), $$

i.e., a gain variation in $P_0^{\mathrm{std}}$ (see section 9.1.3). Figure 11.13 shows the level curves of 
the maximum logarithmic sensitivity of the magnitude of the I/O transfer function 
$H_{13}$, over the frequency range $0 \le \omega \le 1$, to these gain changes, i.e., 

$$ \sup_{0 \le \omega \le 1} \left| \frac{\partial}{\partial \gamma} \log |H_{13}(j\omega)| \,\Big|_{\gamma = 0} \right| = \max_{0 \le \omega \le 1} |\Re S(j\omega)|, \tag{11.12} $$

where 

$$ S(j\omega) = 1 - \left(\alpha H_{13}^{(a)}(j\omega) + \beta H_{13}^{(b)}(j\omega) + (1-\alpha-\beta) H_{13}^{(c)}(j\omega)\right). $$

As expected, the level curves in figure 11.13 bound convex subsets of $\mathcal{H}_{\mathrm{slice}}$. 

Figure 11.13 Level curves of the logarithmic sensitivity of the magnitude 
of the I/O transfer function $H_{13}$, over the frequency range $0 \le \omega \le 1$, to 
gain changes in the plant $P_0^{\mathrm{std}}$, given by (11.12). 

When the function (11.12) takes on the value 0.3, the maximum first order 
change in $|H_{13}(j\omega)|$, over $0 \le \omega \le 1$, with a 25% plant gain change is a factor of 
$\exp(0.3 \times 0.25) = \exp(0.075)$, or 0.65dB. In figure 11.14 the actual maximum change 
in $|H_{13}(j\omega)|$ is shown for points on the 0.3 contour of the function (11.12). 
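
The arithmetic behind the 0.65dB figure can be checked directly. The short Python sketch below only restates the numbers quoted in the paragraph above; the variable names are illustrative.

```python
import math

log_sensitivity = 0.3      # value of the functional (11.12) on the contour
gain_change = 0.25         # a 25% plant gain change, i.e. gamma = 0.25

# First-order change in log|H13|: sensitivity times the size of the gain change.
delta_log = log_sensitivity * gain_change

# Convert the corresponding magnitude ratio exp(delta_log) to decibels.
ratio = math.exp(delta_log)
change_db = 20.0 * math.log10(ratio)
```

Here `delta_log` is 0.075 and `change_db` rounds to 0.65, matching the text.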
Figure 11.14 To first order, the peak change in $|H_{13}(j\omega)|$ for $0 \le \omega \le 1$ 
along the 0.3 contour in figure 11.13, for a 25% gain change in $P_0^{\mathrm{std}}$, will be 
0.65dB. The 0.3 contour from figure 11.13 is shown, together with the actual 
peak change in $|H_{13}(j\omega)|$ for $0 \le \omega \le 1$ for several points on the contour. 



11.4.2 A Step Response Sensitivity Specification 

In section 9.3 we considered the sensitivity of the I/O step response at $t = 1$ to 
plant gain changes, i.e., $\delta P_0^{\mathrm{std}} = \gamma P_0^{\mathrm{std}}$: 

$$ s_\gamma(1) \triangleq \frac{\partial s(1)}{\partial \gamma} \Big|_{\gamma = 0}. $$

Figure 11.15 shows the subset of $\mathcal{H}_{\mathrm{slice}}$ for which 

$$ |s_\gamma(1)| < 0.75. $$

This specification is equivalent to 

$$ \left| \frac{1}{2\pi} \int_{-\infty}^{\infty} \frac{(1 - T(j\omega))\,T(j\omega)}{j\omega}\, e^{j\omega}\, d\omega \right| < 0.75, \tag{11.13} $$

where 

$$ T(j\omega) = \alpha H_{13}^{(a)}(j\omega) + \beta H_{13}^{(b)}(j\omega) + (1-\alpha-\beta) H_{13}^{(c)}(j\omega). $$

As we showed in section 9.3, and as is clear from figure 11.15, the step response 
sensitivity specification (11.13) is not convex. 

Figure 11.15 The subset of $\mathcal{H}_{\mathrm{slice}}$ that has an I/O step response sensitivity 
magnitude, at $t = 1$, of less than 0.75. This is the set of points for 
which (11.13) holds. 



11.5 Robustness Specifications 

11.5.1 Gain Margin 

We now consider the gain margin specification $\mathcal{D}_{+4,-3.5\mathrm{db\_gm}}$: the system should 
remain stable for gain changes in $P_0^{\mathrm{std}}$ between +4dB and −3.5dB. In section 10.4.4 
we used the small gain theorem to show that 

$$ \|T\|_\infty < 1.71 \tag{11.14} $$

$$ \|S\|_\infty < 2.02 \tag{11.15} $$

are (different) inner approximations of the gain margin specification $\mathcal{D}_{+4,-3.5\mathrm{db\_gm}}$. 

Figure 11.16 shows the subset of $\mathcal{H}_{\mathrm{slice}}$ that meets the gain margin specification 
$\mathcal{D}_{+4,-3.5\mathrm{db\_gm}}$, together with the two inner approximations (11.14-11.15), i.e., 

$$ \left\| \alpha H_{13}^{(a)} + \beta H_{13}^{(b)} + (1-\alpha-\beta) H_{13}^{(c)} \right\|_\infty < 1.71, \tag{11.16} $$

$$ \left\| 1 - \left(\alpha H_{13}^{(a)} + \beta H_{13}^{(b)} + (1-\alpha-\beta) H_{13}^{(c)}\right) \right\|_\infty < 2.02. \tag{11.17} $$

In general, the specification $\mathcal{D}_{+4,-3.5\mathrm{db\_gm}}$ is not convex, even though in this case 
the subset of $\mathcal{H}_{\mathrm{slice}}$ that satisfies $\mathcal{D}_{+4,-3.5\mathrm{db\_gm}}$ is convex. The two inner 
approximations (11.16-11.17) are norm-bounds on $H$, and are therefore convex (see 
section 6.3.2). The two approximations (11.16-11.17) are convex subsets of the exact 
region that satisfies $\mathcal{D}_{+4,-3.5\mathrm{db\_gm}}$. 

Figure 11.16 The boundary of the region where the gain margin specification 
$\mathcal{D}_{+4,-3.5\mathrm{db\_gm}}$ is met is shown, together with the boundaries of the 
two inner approximations (11.16-11.17). In this case the exact region turns 
out to be convex, but this is not generally so. 

The bound on the sensitivity transfer function magnitude given by (10.69) is an 
inner approximation to each gain margin specification. Figure 11.17 shows the level 
curves of the peak magnitude of the sensitivity transfer function, i.e., 

$$ \varphi_{\mathrm{max\_sens}}(\alpha, \beta) = \left\| 1 - \left(\alpha H_{13}^{(a)} + \beta H_{13}^{(b)} + (1-\alpha-\beta) H_{13}^{(c)}\right) \right\|_\infty. \tag{11.18} $$

In section 6.3.2 we showed that a function of the form (11.18) is convex; the level 
curves in figure 11.17 do indeed bound convex subsets of $\mathcal{H}_{\mathrm{slice}}$. 



11.5.2 Generalized Gain Margin 

We now consider a tighter gain margin specification than $\mathcal{D}_{+4,-3.5\mathrm{db\_gm}}$: for plant 
gain changes between +4dB and −3.5dB the stability degree exceeds 0.2, i.e., the 
closed-loop system poles have real parts less than −0.2. In section 10.4.4 we used 
the small gain theorem to show that 

$$ \|S\|_{\infty,0.2} < 2.02 \tag{11.19} $$

is an inner approximation of this generalized gain margin specification. 

Figure 11.17 The level curves of the sensitivity transfer function magnitude, 
given by (11.18). 

Figure 11.18 shows the subset of $\mathcal{H}_{\mathrm{slice}}$ that meets the generalized gain margin 
specification, together with the inner approximation (11.19), i.e., 

$$ \left\| 1 - \left(\alpha H_{13}^{(a)} + \beta H_{13}^{(b)} + (1-\alpha-\beta) H_{13}^{(c)}\right) \right\|_{\infty,0.2} < 2.02. \tag{11.20} $$

The generalized gain margin specification is not in general convex, even though in 
this case the subset of $\mathcal{H}_{\mathrm{slice}}$ that satisfies the generalized gain margin specification 
is convex. The inner approximation (11.20) is a norm-bound on $H$, and is therefore 
convex (see section 6.3.2). 

11.5.3 Robust Stability with Relative Plant Uncertainty 

Figure 11.19 shows the subset of $\mathcal{H}_{\mathrm{slice}}$ that meets the specification 

$$ \mathcal{D}_{\mathrm{rob\_stab}}(\mathcal{P}), \tag{11.21} $$

where $\mathcal{P}$ is the plant perturbation set (10.10), i.e., robust stability with the relative 
plant uncertainty $W_{\mathrm{rel\_err}}$ described in section 10.2.3. The specification (11.21) is 
equivalent to the convex inner approximation 

$$ \left\| W_{\mathrm{rel\_err}} \left(\alpha H_{13}^{(a)} + \beta H_{13}^{(b)} + (1-\alpha-\beta) H_{13}^{(c)}\right) \right\|_\infty < 1 \tag{11.22} $$

derived from the small gain theorem in section 10.4.4, i.e., in this case the small 
gain theorem is not conservative (see the Notes and References in chapter 10). 

Figure 11.18 The boundary of the exact region where the generalized gain 
margin specification is met is shown, together with the boundary of the inner 
approximation (11.20). This specification is tighter than the specification 
$\mathcal{D}_{+4,-3.5\mathrm{db\_gm}}$ shown in figure 11.16. 

Figure 11.19 The region where the robust stability specification (11.21) is 
met is shaded. This region is the same as the inner approximation (11.22). 

11.5.4 Robust Performance 

Figure 11.20 shows the subset of $\mathcal{H}_{\mathrm{slice}}$ that meets the specification 

$$ \mathcal{D}_{\mathrm{rob}}(\mathcal{P}, \|H_{23}\|_\infty < 75), \tag{11.23} $$

where $\mathcal{P}$ is the plant perturbation set (10.10), i.e., the plant perturbations described 
in section 10.2.3 never cause the RMS gain from the reference input to the actuator 
signal to exceed 75. In section 10.5.2 we showed that an inner approximation of the 
specification (11.23) is 

$$ \left| \alpha H_{13}^{(a)}(j\omega) + \beta H_{13}^{(b)}(j\omega) + (1-\alpha-\beta) H_{13}^{(c)}(j\omega) \right| < l(\omega) \quad \text{for } \omega \in \mathbf{R} \tag{11.24} $$

(note that $H_{23} = H_{13}/P_0^{\mathrm{std}}$), where 

$$ l(\omega) = \frac{1}{\sqrt{2\left( |W_{\mathrm{rel\_err}}(j\omega)|^2 + |1/(75 P_0^{\mathrm{std}}(j\omega))|^2 \right)}}, $$

which is also shown in figure 11.20. The exact region is not in general convex, 
although in this case it happens to be convex (see the Notes and References in 
chapter 10). 

Figure 11.20 The boundary of the exact region where the robust performance 
specification (11.23) is met is shown, together with the inner approximation (11.24). 

Figure 11.21 shows the exact specification (11.23), together with the convex inner 
approximation (11.24) and a convex outer approximation (i.e., a specification that 
is weaker than (11.23)). The outer approximation is the simultaneous satisfaction 
of the specifications 

$$ \mathcal{D}_{\mathrm{rob\_stab}}(\mathcal{P}) \quad \text{and} \quad \|H_{23}\|_\infty < 75, \tag{11.25} $$

described in sections 11.5.3 and 11.3.3 respectively. The specifications (11.25) 
require robust stability, and that the nominal system has an RMS gain from the 
reference to actuator signal not exceeding 75. The robust performance specification 
(11.23) therefore implies (11.25), so (11.25) is an outer approximation of the 
robust performance specification (11.23). The outer approximation in figure 11.21 
is the set of $\alpha$, $\beta$ for which 

$$ \left\| W_{\mathrm{rel\_err}} \left(\alpha H_{13}^{(a)} + \beta H_{13}^{(b)} + (1-\alpha-\beta) H_{13}^{(c)}\right) \right\|_\infty < 1, \qquad \left\| \alpha H_{23}^{(a)} + \beta H_{23}^{(b)} + (1-\alpha-\beta) H_{23}^{(c)} \right\|_\infty < 75 \tag{11.26} $$

(see figure 10.21). 

Figure 11.21 The boundary of the exact region where the robust performance 
specification (11.23) is met is shown, together with the inner approximation 
(11.24) and the outer approximation (11.26). The outer approximation 
is the intersection of the nominal performance specification and the 
robust stability specification (see figures 11.12 and 11.19). 

11.6 Nonconvex Design Specifications 

11.6.1 Controller Stability 

Figure 11.22 shows the subset of $\mathcal{H}_{\mathrm{slice}}$ that is achieved by an open-loop stable 
controller, i.e., 

$$ \left\{ [\alpha\ \ \beta]^T \;\middle|\; \alpha H^{(a)} + \beta H^{(b)} + (1-\alpha-\beta) H^{(c)} \text{ is achieved by a stable controller } K \right\}. \tag{11.27} $$

From figure 11.22, we see that (11.27) is a nonconvex subset of $\mathcal{H}_{\mathrm{slice}}$. We conclude 
that a specification requiring open-loop controller stability, 

$$ \mathcal{H}_{\mathrm{k\_stab}} = \left\{ H \;\middle|\; H = P_{zw} + P_{zu} K (I - P_{yu} K)^{-1} P_{yw} \text{ for some stable } K \text{ that stabilizes } P \right\}, $$

is not in general convex. 

Figure 11.22 Region where the closed-loop transfer matrix $H$ is achieved 
by a controller that is open-loop stable. It is not convex. 



11.7 A Weighted-Max Functional 



Consider the functional 

$$ \varphi_{\mathrm{wt\_max}}(\alpha, \beta) = \max\left\{ \varphi_{\mathrm{pk\_trk}}(\alpha, \beta),\ 0.5\,\varphi_{\mathrm{max\_sens}}(\alpha, \beta),\ 15\,\varphi_{\mathrm{rms\_yp}}(\alpha, \beta) \right\}, $$

where the functions $\varphi_{\mathrm{pk\_trk}}$, $\varphi_{\mathrm{max\_sens}}$, and $\varphi_{\mathrm{rms\_yp}}$ are given by (11.2), (11.18), 
and (11.6). The level curves of the function $\varphi_{\mathrm{wt\_max}}(\alpha, \beta)$ are shown in figure 11.23. 
The function $\varphi_{\mathrm{wt\_max}}$ will be used for several examples in chapter 14. 

Figure 11.23 The level curves of $\varphi_{\mathrm{wt\_max}}(\alpha, \beta)$. 

Notes and References 

How These Figures Were Computed 

Most level curves of convex functions were plotted using a radial bisection method. 
Consider the problem of plotting the $\phi(H) = \gamma$ level curve on $\mathcal{H}_{\mathrm{slice}}$, where $\phi$ is convex. Assume 
that we know some point $(\alpha_0, \beta_0)$ inside this level curve, i.e., 

$$ \phi\left(\alpha_0 H^{(a)} + \beta_0 H^{(b)} + (1-\alpha_0-\beta_0) H^{(c)}\right) < \gamma. $$

The value of $\phi$ along the radial line segment 

$$ \alpha = \alpha_0 + \lambda\cos(\theta), \qquad \beta = \beta_0 + \lambda\sin(\theta), \tag{11.28} $$

where $\lambda \ge 0$, is 

$$ \varphi_\theta(\lambda) = \phi\left( (\alpha_0 + \lambda\cos(\theta))\,(H^{(a)} - H^{(c)}) + (\beta_0 + \lambda\sin(\theta))\,(H^{(b)} - H^{(c)}) + H^{(c)} \right). $$

For each $\theta$, $\varphi_\theta$ is a convex function from $\mathbf{R}_+$ to $\mathbf{R}$ with $\varphi_\theta(0) < \gamma$, so there is no more 
than one $\lambda \ge 0$ for which 

$$ \varphi_\theta(\lambda) = \gamma. \tag{11.29} $$

(11.29) can be solved using a number of standard methods, such as bisection or regula 
falsi, with only an evaluation of $\varphi_\theta$ required at each iteration. As $\theta$ sweeps out the angles 
$0 \le \theta < 2\pi$, the solution $\lambda$ to (11.29), together with (11.28), sweeps out the desired level 
curve. This method was used for figures 11.4, 11.5, 11.12, 11.13, 11.17, 11.19, 11.20, 11.21, 
and 11.23. 
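
The radial bisection idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `phi` is any convex function of $(\alpha, \beta)$ supplied by the caller (here a hypothetical quadratic stands in for a functional restricted to the slice), and the search radius `lam_max`, the tolerance, and the function name are choices made for this sketch rather than details from the text.

```python
import math

def radial_level_curve(phi, center, gamma, n_angles=72, lam_max=10.0, tol=1e-10):
    """Trace the phi = gamma level curve around a point where phi < gamma.

    phi maps (alpha, beta) to a real number and is assumed convex, so along
    each ray from the center it crosses the value gamma at most once.
    """
    a0, b0 = center
    if not phi(a0, b0) < gamma:
        raise ValueError("center must lie strictly inside the level curve")
    curve = []
    for k in range(n_angles):
        theta = 2.0 * math.pi * k / n_angles
        c, s = math.cos(theta), math.sin(theta)
        lo, hi = 0.0, lam_max
        if phi(a0 + hi * c, b0 + hi * s) < gamma:
            continue  # no crossing within lam_max along this ray
        while hi - lo > tol:  # bisection on phi_theta(lambda) = gamma
            mid = 0.5 * (lo + hi)
            if phi(a0 + mid * c, b0 + mid * s) < gamma:
                lo = mid
            else:
                hi = mid
        lam = 0.5 * (lo + hi)
        curve.append((a0 + lam * c, b0 + lam * s))
    return curve

# Hypothetical convex test function whose level curves are ellipses.
phi = lambda a, b: a * a + 4.0 * b * b
curve = radial_level_curve(phi, center=(0.2, 0.1), gamma=1.0)
```

Each ray costs one evaluation of `phi` per bisection step, which is the property the text relies on when the functional is expensive to evaluate.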

The above method also applies to quasiconvex functionals with the following modification: 
in place of (11.29) we need to find the largest $\lambda$ for which $\varphi_\theta(\lambda) \le \gamma$. 

In certain cases (11.29), or its quasiconvex modification, can be solved directly. For 
example, consider the quasiconvex settling-time functional $\phi_{\mathrm{settle}}$ from section 11.1.1. For a 
fixed $\theta$, the step response along the line (11.28) is of the form 

$$ s_0(t) + \lambda s_1(t), $$

where $s_0$ is the step response of 

$$ \alpha_0 H_{13}^{(a)} + \beta_0 H_{13}^{(b)} + (1-\alpha_0-\beta_0) H_{13}^{(c)}, $$

and $s_1$ is the step response of 

$$ \cos(\theta)\,H_{13}^{(a)} + \sin(\theta)\,H_{13}^{(b)} + (-\cos(\theta) - \sin(\theta))\,H_{13}^{(c)}. $$

The largest value of $\lambda$ for which $\varphi_\theta(\lambda) \le T_{\max}$ is given by 

$$ \lambda^* = \max\left\{ \lambda \;\middle|\; 0.95 \le s_0(t) + \lambda s_1(t) \le 1.05 \text{ for } t \ge T_{\max} \right\}. $$

This is a linear program in the scalar variable $\lambda$ that can be directly solved. 
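
A scalar linear program of this kind reduces to taking a minimum of ratios. The sketch below assumes the two step responses have been sampled on a common time grid covering $t \ge T_{\max}$ (a discretization that is an assumption of this example, not part of the text), and that $\lambda = 0$ is feasible; the sample responses and the name `largest_lambda` are hypothetical.

```python
import math

def largest_lambda(s0, s1, lower=0.95, upper=1.05):
    """Largest lambda >= 0 with lower <= s0[i] + lambda*s1[i] <= upper for all i.

    s0 and s1 are step-response samples taken at times t >= T_max, and s0 is
    assumed to satisfy the bounds already (so lambda = 0 is feasible).
    """
    lam = math.inf
    for a, b in zip(s0, s1):
        if b > 0.0:
            lam = min(lam, (upper - a) / b)   # the upper bound becomes active
        elif b < 0.0:
            lam = min(lam, (lower - a) / b)   # the lower bound becomes active
    return lam

# Hypothetical sampled responses on the grid t = 3.0, 3.1, ..., 10.0 (T_max = 3).
times = [3.0 + 0.1 * k for k in range(71)]
s0 = [1.0 - math.exp(-2.0 * t) for t in times]
s1 = [math.exp(-t) for t in times]
lam_star = largest_lambda(s0, s1)
```

For these sample responses the binding constraint is the upper bound at $t = 3$, which gives a value of about 1.054.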

A similar method was used to produce figures 11.8 and 11.10. A similar method could 
also be used to produce level curves of $H_\infty$ norms; at each frequency $\omega$ a quadratic in $\lambda$ 
can be solved to find the positive value of $\lambda$ that makes the frequency response magnitude 
tight at $\omega$. Taking the minimum $\lambda$ over all $\omega$ gives the desired $\lambda^*$. 
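
For a scalar transfer function the per-frequency quadratic is $|h_0 + \lambda h_1|^2 = \gamma^2$, where $h_0$ and $h_1$ are the frequency responses at the center point and along the ray. The Python sketch below works on a finite frequency grid, which is an approximation introduced here; the example transfer functions and the name `hinf_radial_step` are hypothetical.

```python
import math

def hinf_radial_step(h0, h1, gamma):
    """Largest lambda >= 0 with |h0[k] + lambda*h1[k]| <= gamma at every sample.

    h0 and h1 are complex frequency-response samples on a common grid, with
    |h0[k]| < gamma assumed at every sample.
    """
    lam_star = math.inf
    for z0, z1 in zip(h0, h1):
        a = abs(z1) ** 2
        if a == 0.0:
            continue  # this frequency does not constrain lambda
        b = 2.0 * (z0.conjugate() * z1).real
        c = abs(z0) ** 2 - gamma ** 2          # negative by assumption
        root = (-b + math.sqrt(b * b - 4.0 * a * c)) / (2.0 * a)  # the positive root
        lam_star = min(lam_star, root)
    return lam_star

# Hypothetical SISO example: H0(s) = 0.5/(s+1), H1(s) = 1/(s+1), for 0 <= omega <= 10.
omegas = [0.01 * k for k in range(1001)]
h0 = [0.5 / (1j * w + 1.0) for w in omegas]
h1 = [1.0 / (1j * w + 1.0) for w in omegas]
lam_star = hinf_radial_step(h0, h1, gamma=1.0)
```

In this example the constraint is tight at $\omega = 0$, where $0.5 + \lambda = 1$, so the computed step is 0.5.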

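Figures 11.9 and 11.11 were plotted by directly computing the equations of the level 
curves using a state-space method. For example, consider $\|H_{12}\|_2^2$, which is one term in 
the functional $\phi_{\mathrm{rms\_yp}}$. Since 

$$ \alpha H_{12}^{(a)} + \beta H_{12}^{(b)} + (1-\alpha-\beta) H_{12}^{(c)} = H_{12}^{(c)} + \alpha\left(H_{12}^{(a)} - H_{12}^{(c)}\right) + \beta\left(H_{12}^{(b)} - H_{12}^{(c)}\right) $$

is affine in $\alpha$ and $\beta$, it has a state-space realization 

$$ C(sI - A)^{-1}(B_0 + \alpha B_1 + \beta B_2), $$

where $C$ is a row vector, and $B_0$, $B_1$ and $B_2$ are column vectors. From section 5.6.1, if 
$W_{\mathrm{obs}}$ is the solution to the Lyapunov equation 

$$ A^T W_{\mathrm{obs}} + W_{\mathrm{obs}} A + C^T C = 0, $$

we have 

$$ \left\| \alpha H_{12}^{(a)} + \beta H_{12}^{(b)} + (1-\alpha-\beta) H_{12}^{(c)} \right\|_2^2 = [\,1\ \ \alpha\ \ \beta\,]\; E \begin{bmatrix} 1 \\ \alpha \\ \beta \end{bmatrix}, $$

where 

$$ E = \begin{bmatrix} B_0^T \\ B_1^T \\ B_2^T \end{bmatrix} W_{\mathrm{obs}}\, [\,B_0\ \ B_1\ \ B_2\,] $$

is a positive definite $3 \times 3$ matrix. The level curves of $\|H_{12}\|_2$ on $\mathcal{H}_{\mathrm{slice}}$ are therefore 
ellipses. 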

The convex sets shown in figures 11.16 and 11.18 were produced using the standard radial 
method described above. The exact contour along which the gain margin specification 
was tight was also found by a radial method. However, since this specification need not 
be convex, a fine grid search was also used to verify that the entire set had been correctly 
determined. 

The step response sensitivity plot in figure 11.15 was produced by forming an indefinite 
quadratic in $\alpha$ and $\beta$. The sensitivity, from (11.13), is the step response of 

$$ \left(1 - \left(\alpha H_{13}^{(a)} + \beta H_{13}^{(b)} + (1-\alpha-\beta) H_{13}^{(c)}\right)\right) \left(\alpha H_{13}^{(a)} + \beta H_{13}^{(b)} + (1-\alpha-\beta) H_{13}^{(c)}\right) $$

at $t = 1$. After expansion, the step response of each term, at $t = 1$, gives each of the 
coefficients in an indefinite quadratic form in $\alpha$ and $\beta$. Figure 11.15 shows the $\alpha$, $\beta$ for 
which 

$$ -0.75 < [\,1\ \ \alpha\ \ \beta\,] \begin{bmatrix} 0.7149 & -0.0055 & 0.1180 \\ 0.0055 & -0.0074 & 0.0447 \\ 0.1180 & 0.0447 & -0.2493 \end{bmatrix} \begin{bmatrix} 1 \\ \alpha \\ \beta \end{bmatrix} < 0.75. $$
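
The nonconvexity of this region can be seen numerically from the quadratic form itself. The sketch below evaluates the form with the matrix entries copied as printed above and tests three collinear points with $\alpha = 0$; the function names are illustrative.

```python
# Matrix of the quadratic form, with entries copied as printed above.
E = [[0.7149, -0.0055, 0.1180],
     [0.0055, -0.0074, 0.0447],
     [0.1180,  0.0447, -0.2493]]

def step_sensitivity(alpha, beta):
    """Evaluate [1 alpha beta] E [1 alpha beta]^T."""
    v = (1.0, alpha, beta)
    return sum(v[i] * E[i][j] * v[j] for i in range(3) for j in range(3))

def in_region(alpha, beta, bound=0.75):
    """True when the step response sensitivity magnitude is below the bound."""
    return -bound < step_sensitivity(alpha, beta) < bound

at_c = in_region(0.0, 0.0)       # the point corresponding to H^(c)
at_b = in_region(0.0, 1.0)       # the point corresponding to H^(b)
midpoint = in_region(0.0, 0.5)   # halfway between them
```

Both endpoints satisfy the bound (values 0.7149 and 0.7016), while the midpoint gives about 0.7706 and violates it, so the set contains two points but not the segment joining them.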



The controller stability plot in figure 11.22 was produced by finding points where the 
transfer function $S(j\omega)/\omega^2$ vanished for some frequency $\omega$. Since $S = 1/(1 + P_0^{\mathrm{std}} K)$ 
vanishes wherever $P_0^{\mathrm{std}}$ or $K$ has a $j\omega$ axis pole, the $j\omega$ axis poles of $K$ are exactly the 
$j\omega$ axis zeros of $S(j\omega)/\omega^2$; the factor of $\omega^2$ cancels the two zeros at $s = 0$ that $S$ inherits 
from the two $s = 0$ poles of $P_0^{\mathrm{std}}$. At each frequency $\omega$, the linear equations in $\alpha$ and $\beta$ 

$$ \Re\left( \left(1 - \left(\alpha H_{13}^{(a)}(j\omega) + \beta H_{13}^{(b)}(j\omega) + (1-\alpha-\beta) H_{13}^{(c)}(j\omega)\right)\right) / \omega^2 \right) = 0, $$

$$ \Im\left( \left(1 - \left(\alpha H_{13}^{(a)}(j\omega) + \beta H_{13}^{(b)}(j\omega) + (1-\alpha-\beta) H_{13}^{(c)}(j\omega)\right)\right) / \omega^2 \right) = 0 $$

may be dependent, independent, or inconsistent; their solution in the first two cases gives 
either a line or point in the $(\alpha, \beta)$ plane. When these lines and points are plotted over 
all frequencies they determine subsets of $\mathcal{H}_{\mathrm{slice}}$ over which the controller $K$ has a constant 
number of unstable (right half-plane) poles. By checking any one controller inside each 
subset of $\mathcal{H}_{\mathrm{slice}}$ for open-loop stability, each subset of $\mathcal{H}_{\mathrm{slice}}$ can be labeled as being achieved 
by stable or unstable controllers. 



Part IV 

NUMERICAL METHODS 

Chapter 12 

Some Analytic Solutions 



We describe several families of controller design problems that can be solved 
rapidly and exactly using standard methods. 



12.1 Linear Quadratic Regulator 

The linear quadratic regulator (LQR) from optimal control theory can be used to 
solve a family of regulator design problems in which the state is accessible and 
regulation and actuator effort are each measured by mean-square deviation. A stochastic 
formulation of the LQR problem is convenient for us; a more usual formulation is 
as an optimal control problem (see the Notes and References at the end of this 
chapter). The system is described by 

$$ \dot{x} = Ax + Bu + w, $$

where $w$ is a zero-mean white noise, i.e., $w$ has power spectral density matrix 
$S_w(\omega) = I$ for all $\omega$. The state $x$ is available to the controller, so $y = x$ in our 
framework. 

The LQR cost function is the sum of the steady-state mean-square weighted 
state $x$, and the steady-state mean-square weighted actuator signal $u$: 

$$ J_{\mathrm{lqr}} = \lim_{t \to \infty} \mathbf{E}\left( x(t)^T Q x(t) + u(t)^T R u(t) \right), $$

where $Q$ and $R$ are positive semidefinite weight matrices; the first term penalizes 
deviations of $x$ from zero, and the second term represents the cost of using the 
actuator signal. We can express this cost in our framework by forming the regulated 
output signal 

$$ z = \begin{bmatrix} R^{1/2} u \\ Q^{1/2} x \end{bmatrix} $$

so that 

$$ J_{\mathrm{lqr}} = \lim_{t \to \infty} \mathbf{E}\, z(t)^T z(t), $$

the mean-square deviation of $z$. Since $w$ is a white noise, we have (see section 5.2.2) 

$$ J_{\mathrm{lqr}} = \|H\|_2^2, $$

the square of the $H_2$ norm of the closed-loop transfer matrix. 

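In our framework, the plant for the LQR regulator problem is given by 

$$ A_P = A, \qquad B_u = B, \qquad B_w = I, $$

$$ C_z = \begin{bmatrix} 0 \\ Q^{1/2} \end{bmatrix}, \qquad C_y = I, $$

$$ D_{zw} = 0, \qquad D_{zu} = \begin{bmatrix} R^{1/2} \\ 0 \end{bmatrix}, \qquad D_{yw} = 0, \qquad D_{yu} = 0 $$

(the matrices on the left-hand sides refer to the state-space equations from section 2.5). 
This is shown in figure 12.1. 

Figure 12.1 The LQR cost is $\|H\|_2^2$. 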



The specifications that we consider are realizability and the functional inequality 
specification 

$$ \|H\|_2 < \alpha. \tag{12.1} $$

Standard assumptions are that $(Q, A)$ is observable, $(A, B)$ is controllable, and 
$R > 0$, in which case the specification (12.1) is stronger than (i.e., implies) internal 
stability. (Recall our comment in chapter 7 that internal stability is often a redundant 
addition to a sensible set of specifications.) With these standard assumptions, 
there is actually a controller that achieves the smallest achievable LQR cost, and it 
turns out to be a constant state-feedback, 

$$ K_{\mathrm{lqr}}(s) = -K_{\mathrm{sfb}}, $$

which can be found as follows. 

Let $X_{\mathrm{lqr}}$ denote the unique positive definite solution of the algebraic Riccati 
equation 

$$ A^T X_{\mathrm{lqr}} + X_{\mathrm{lqr}} A - X_{\mathrm{lqr}} B R^{-1} B^T X_{\mathrm{lqr}} + Q = 0. \tag{12.2} $$

One method of finding this $X_{\mathrm{lqr}}$ is to form the associated Hamiltonian matrix 

$$ M = \begin{bmatrix} A & -B R^{-1} B^T \\ -Q & -A^T \end{bmatrix}, \tag{12.3} $$

and then compute any matrix $T$ such that 

$$ T^{-1} M T = \begin{bmatrix} \tilde{A}_{11} & \tilde{A}_{12} \\ 0 & \tilde{A}_{22} \end{bmatrix}, $$

where $\tilde{A}_{11}$ is stable. (One good choice is to compute an ordered Schur form of $M$; 
see the Notes and References in chapter 5.) We then partition $T$ as 

$$ T = \begin{bmatrix} T_{11} & T_{12} \\ T_{21} & T_{22} \end{bmatrix}, $$

and the solution $X_{\mathrm{lqr}}$ is given by 

$$ X_{\mathrm{lqr}} = T_{21} T_{11}^{-1}. $$

(We encountered a similar ARE in section 5.6.3; this solution method is analogous 
to the one described there.) 

Once we have found $X_{\mathrm{lqr}}$, we have 

$$ K_{\mathrm{sfb}} = R^{-1} B^T X_{\mathrm{lqr}}, $$

which achieves LQR cost 

$$ J_{\mathrm{lqr}}^* = \mathbf{Tr}\, X_{\mathrm{lqr}}. $$

In particular, the specification (12.1) (along with realizability) is achievable if and 
only if $\alpha > \sqrt{\mathbf{Tr}\, X_{\mathrm{lqr}}}$, in which case the LQR-optimal controller $K_{\mathrm{lqr}}$ achieves the 
specifications. 

In practice, this analytic solution is not used to solve the feasibility problem for 
the one-dimensional family of specifications indexed by a; rather it is used to solve 
multicriterion optimization problems involving actuator effort and state excursion, 
by solving the LQR problem for various weights R and Q. This is explained further 
in section 12.2.1. 
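
In the scalar case the Hamiltonian construction can be carried out by hand, which makes a convenient sanity check. The Python sketch below treats a hypothetical first-order plant $\dot{x} = ax + bu + w$ with scalar weights $q$ and $r$; the numerical values and the name `scalar_lqr` are assumptions made for illustration. For matrix problems one would use a numerical Riccati or ordered Schur routine instead.

```python
import math

def scalar_lqr(a, b, q, r):
    """Solve the scalar LQR problem for dx/dt = a*x + b*u + w.

    Returns (X, K): the positive ARE solution and the state-feedback gain.
    """
    mu = math.sqrt(a * a + b * b * q / r)   # Hamiltonian eigenvalues are +mu and -mu
    # Eigenvector of M = [[a, -b*b/r], [-q, -a]] for the stable eigenvalue -mu.
    t11 = b * b / r
    t21 = a + mu
    X = t21 / t11                            # X = T21 * T11^{-1}
    K = b * X / r                            # K_sfb = R^{-1} B^T X
    return X, K

a, b, q, r = 1.0, 1.0, 3.0, 1.0              # hypothetical unstable plant and weights
X, K = scalar_lqr(a, b, q, r)
are_residual = 2.0 * a * X - X * b * b * X / r + q   # left-hand side of (12.2)
closed_loop_pole = a - b * K                 # dynamics under u = -K x
state_variance = 1.0 / (-2.0 * closed_loop_pole)     # steady-state E x^2 for unit-intensity w
cost = (q + r * K * K) * state_variance      # E(q x^2 + r u^2)
```

With these numbers $X = 3$, the closed-loop pole is at $-2$, and the directly computed cost equals $X$, in agreement with $J_{\mathrm{lqr}}^* = \mathbf{Tr}\, X_{\mathrm{lqr}}$.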
12.2 Linear Quadratic Gaussian Regulator 

The linear quadratic Gaussian (LQG) problem is a generalization of the LQR problem 
to the case in which the state is not sensed directly. For the LQG problem we 
consider the system given by 

$$ \dot{x} = Ax + Bu + w_{\mathrm{proc}}, \tag{12.4} $$

$$ y = Cx + v_{\mathrm{sensor}}, \tag{12.5} $$

where the process noise $w_{\mathrm{proc}}$ and measurement noise $v_{\mathrm{sensor}}$ are independent and 
have constant power spectral density matrices $W$ and $V$, respectively. 

The LQG cost function is the sum of the steady-state mean-square weighted 
state $x$, and the steady-state mean-square weighted actuator signal $u$: 

$$ J_{\mathrm{lqg}} = \lim_{t \to \infty} \mathbf{E}\left( x(t)^T Q x(t) + u(t)^T R u(t) \right), \tag{12.6} $$

where $Q$ and $R$ are positive semidefinite weight matrices. 

This LQG problem can be cast in our framework as follows. Just as in the 
LQR problem, we extract the (weighted) plant state $x$ and actuator signal $u$ as the 
regulated output, i.e., 

$$ z = \begin{bmatrix} R^{1/2} u \\ Q^{1/2} x \end{bmatrix}. $$

The exogenous input consists of the process and measurement noises, which we 
represent as 

$$ \begin{bmatrix} w_{\mathrm{proc}} \\ v_{\mathrm{sensor}} \end{bmatrix} = \begin{bmatrix} W^{1/2} & 0 \\ 0 & V^{1/2} \end{bmatrix} w, $$

with $w$ a white noise signal, i.e., $S_w(\omega) = I$. The state-space description of the 
plant for the LQG problem is thus 

$$ A_P = A \tag{12.7} $$

$$ B_w = [\,W^{1/2}\ \ 0\,] \tag{12.8} $$

$$ B_u = B \tag{12.9} $$

$$ C_z = \begin{bmatrix} 0 \\ Q^{1/2} \end{bmatrix} \tag{12.10} $$

$$ C_y = C \tag{12.11} $$

$$ D_{zw} = 0 \tag{12.12} $$

$$ D_{zu} = \begin{bmatrix} R^{1/2} \\ 0 \end{bmatrix} \tag{12.13} $$

$$ D_{yw} = [\,0\ \ V^{1/2}\,] \tag{12.14} $$

$$ D_{yu} = 0. \tag{12.15} $$

This is shown in figure 12.2. 

Figure 12.2 The LQG cost is $\|H\|_2^2$. 

Since $w$ is a white noise, the LQG cost is simply the variance of $z$, which is given 
by 

$$ J_{\mathrm{lqg}} = \|H\|_2^2. $$

The specifications for the LQG problem are therefore the same as for the LQR 
problem: realizability and the H 2 norm-bound (12.1). 

Standard assumptions for the LQG problem are that the plant is controllable
from each of $u$ and $w$, observable from each of $z$ and $y$, a positive weight is used
for the actuator signal ($R > 0$), and the sensor noise satisfies $V > 0$. With these
standard assumptions in force, there is a unique controller $K_{\mathrm{lqg}}$ that minimizes
the LQG objective. This controller has the form of an estimated-state-feedback
controller (see section 7.4); the optimal state-feedback and estimator gains, $K_{\mathrm{sfb}}$
and $L_{\mathrm{est}}$, can be determined by solving two algebraic Riccati equations as follows.
The state-feedback gain is given by

$$K_{\mathrm{sfb}} = R^{-1} B^T X_{\mathrm{lqg}}, \tag{12.16}$$

where $X_{\mathrm{lqg}}$ is the unique positive definite solution of the Riccati equation

$$A^T X_{\mathrm{lqg}} + X_{\mathrm{lqg}} A - X_{\mathrm{lqg}} B R^{-1} B^T X_{\mathrm{lqg}} + Q = 0, \tag{12.17}$$

which is the same as (12.2). The estimator gain is given by

$$L_{\mathrm{est}} = Y_{\mathrm{lqg}} C^T V^{-1}, \tag{12.18}$$

where $Y_{\mathrm{lqg}}$ is the unique positive definite solution of

$$A Y_{\mathrm{lqg}} + Y_{\mathrm{lqg}} A^T - Y_{\mathrm{lqg}} C^T V^{-1} C Y_{\mathrm{lqg}} + W = 0 \tag{12.19}$$

(which can be solved using the methods already described in sections 12.1 and 5.6.3).
The LQG-optimal controller $K_{\mathrm{lqg}}$ is thus

$$K_{\mathrm{lqg}}(s) = -K_{\mathrm{sfb}} \left( sI - A + B K_{\mathrm{sfb}} + L_{\mathrm{est}} C \right)^{-1} L_{\mathrm{est}}, \tag{12.20}$$

and the optimal LQG cost is

$$J^*_{\mathrm{lqg}} = \operatorname{Tr} \left( X_{\mathrm{lqg}} W + Q Y_{\mathrm{lqg}} + 2 X_{\mathrm{lqg}} A Y_{\mathrm{lqg}} \right). \tag{12.21}$$

The specification $\|H\|_2 \leq \alpha$ (along with realizability) is therefore achievable if and
only if $\alpha \geq \sqrt{J^*_{\mathrm{lqg}}}$, in which case the LQG-optimal controller $K_{\mathrm{lqg}}$ satisfies the
specifications.
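The recipe (12.16)-(12.21) can be sketched numerically as follows. This is a minimal sketch, assuming SciPy's `solve_continuous_are` is used for the two Riccati equations; the function names `lqg_design` and `closed_loop_cost` and the double-integrator data are hypothetical and chosen only for illustration. As a check, the cost (12.6) is also evaluated directly from the steady-state covariance of the closed-loop system and compared with (12.21).

```python
import numpy as np
from scipy.linalg import solve_continuous_are, solve_continuous_lyapunov


def lqg_design(A, B, C, Q, R, W, V):
    """Gains and optimal cost from the Riccati equations (12.17) and (12.19)."""
    X = solve_continuous_are(A, B, Q, R)              # solves (12.17)
    Y = solve_continuous_are(A.T, C.T, W, V)          # solves (12.19)
    K_sfb = np.linalg.solve(R, B.T @ X)               # (12.16)
    L_est = Y @ C.T @ np.linalg.inv(V)                # (12.18)
    J_opt = np.trace(X @ W + Q @ Y + 2 * X @ A @ Y)   # (12.21)
    return K_sfb, L_est, X, Y, J_opt


def closed_loop_cost(A, B, C, Q, R, W, V, K_sfb, L_est):
    """Evaluate (12.6) from the steady-state covariance of (x, xhat)."""
    n = A.shape[0]
    # Controller (12.20): xhat' = (A - B K - L C) xhat + L y,  u = -K xhat.
    A_cl = np.block([[A, -B @ K_sfb],
                     [L_est @ C, A - B @ K_sfb - L_est @ C]])
    # Intensity of the noise driving (x, xhat): w_proc and L v_sensor.
    N_cl = np.block([[W, np.zeros((n, n))],
                     [np.zeros((n, n)), L_est @ V @ L_est.T]])
    # The covariance S solves A_cl S + S A_cl^T + N_cl = 0.
    S = solve_continuous_lyapunov(A_cl, -N_cl)
    S_xx, S_hh = S[:n, :n], S[n:, n:]
    return np.trace(Q @ S_xx) + np.trace(K_sfb.T @ R @ K_sfb @ S_hh)


# Hypothetical example: a double integrator with a position measurement.
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
Q, R = np.eye(2), np.array([[1.0]])
W, V = np.eye(2), np.array([[1.0]])

K_sfb, L_est, X, Y, J_opt = lqg_design(A, B, C, Q, R, W, V)
J_direct = closed_loop_cost(A, B, C, Q, R, W, V, K_sfb, L_est)
print(J_opt, J_direct)   # the two values should agree
```

The direct evaluation is slower than (12.21) but makes no use of optimality, so it can also be applied to other estimated-state-feedback gains.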



12.2.1 Multicriterion LQG Problem 

The LQG objective (12.6) can be interpreted as a weighted-sum objective for a
related multicriterion optimization problem. We consider the same system as in the
LQG problem, given by (12.4-12.5); the objectives are the variances of the actuator
signals,

$$\|u_1\|_{\mathrm{rms}}^2, \ \ldots, \ \|u_{n_u}\|_{\mathrm{rms}}^2,$$

and some critical variables that are linear combinations of the system state,

$$\|c_1 x\|_{\mathrm{rms}}^2, \ \ldots, \ \|c_m x\|_{\mathrm{rms}}^2,$$

where the process and measurement noises are the same as for the LQG problem
($c_1, \ldots, c_m$ are row vectors that determine the critical variables).

We describe this multicriterion optimization problem in our framework as follows.
We use the same plant as for the LQG problem, substituting

$$z = \begin{bmatrix} u \\ c_1 x \\ \vdots \\ c_m x \end{bmatrix}$$

for the regulated output used there. The state-space plant equations for the
multicriterion LQG problem are therefore given by (12.7-12.15), with

$$C_z = \begin{bmatrix} 0 \\ c_1 \\ \vdots \\ c_m \end{bmatrix}$$

substituted for (12.10) and

$$D_{zu} = \begin{bmatrix} I \\ 0 \end{bmatrix}$$

substituted for (12.13). The objectives are given by the squares of the $H_2$ norms of
the rows of the closed-loop transfer matrix:

$$\phi_i(H) = \|H^{(i)}\|_2^2,$$

where $H^{(i)}$ is the $i$th row of $H$ and $L = n_z = n_u + m$. The hard constraint for this
multicriterion optimization problem is realizability.

For $1 \leq i \leq n_u$, $\phi_i(H)$ represents the variance of the $i$th actuator signal, and for
$n_u + 1 \leq i \leq L$, $\phi_i(H)$ represents the variance of the critical variable $c_{i - n_u} x$. The
design specification

$$\phi_1(H) \leq a_1, \ \ldots, \ \phi_L(H) \leq a_L, \tag{12.22}$$

therefore limits the RMS values of the actuator signals and critical variables.

Consider the weighted-sum objective associated with this multicriterion
optimization problem:

$$\phi_{\mathrm{wt\_sum}}(H) = \lambda_1 \phi_1(H) + \cdots + \lambda_L \phi_L(H), \tag{12.23}$$

where $\lambda \geq 0$. We can express this as

$$\phi_{\mathrm{wt\_sum}}(H) = J_{\mathrm{lqg}}$$

if we choose weight matrices

$$Q = \lambda_{n_u + 1} c_1^T c_1 + \cdots + \lambda_L c_m^T c_m, \tag{12.24}$$

$$R = \operatorname{diag}(\lambda_1, \ldots, \lambda_{n_u}) \tag{12.25}$$

($\operatorname{diag}(\cdot)$ is the diagonal matrix with diagonal entries given by the argument list).

Hence by solving an LQG problem, we can find the optimal design for the
weighted-sum objective for the multicriterion optimization problem with functionals
$\phi_1, \ldots, \phi_L$. These designs are Pareto optimal for the multicriterion optimization
problem; moreover, because the objective functionals and the hard constraint are
convex, every Pareto optimal design arises this way for some choice of the weights
$\lambda_1, \ldots, \lambda_L$ (see section 6.5). Roughly speaking, by varying the weights for the LQG
problem, we can "search" the whole tradeoff surface.
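The "search" over weights can be sketched as a simple sweep. This is a minimal sketch, assuming SciPy's Riccati and Lyapunov solvers; the function name `weighted_sum_design`, the double-integrator plant, and the single critical variable (position) are hypothetical. The weights on the actuator signals are assumed strictly positive so that $R > 0$. For each weight vector the sketch builds $Q$ and $R$ from (12.24-12.25), solves the LQG problem, and evaluates the individual objectives $\phi_i$ from the closed-loop covariances.

```python
import numpy as np
from scipy.linalg import solve_continuous_are, solve_continuous_lyapunov


def weighted_sum_design(A, B, C, W, V, c_rows, lam):
    """Objectives phi_1..phi_L at the LQG-optimal design for weights lam."""
    n_u = B.shape[1]
    lam = np.asarray(lam, dtype=float)
    R = np.diag(lam[:n_u])                                            # (12.25)
    Q = sum(l * np.outer(c, c) for l, c in zip(lam[n_u:], c_rows))    # (12.24)
    X = solve_continuous_are(A, B, Q, R)
    Y = solve_continuous_are(A.T, C.T, W, V)
    K = np.linalg.solve(R, B.T @ X)
    L = Y @ C.T @ np.linalg.inv(V)
    # Covariance of the state estimate, driven by the white innovations;
    # the state covariance is S_hh + Y (estimate and error are uncorrelated).
    S_hh = solve_continuous_lyapunov(A - B @ K, -L @ V @ L.T)
    S_xx = S_hh + Y
    phi_u = np.diag(K @ S_hh @ K.T)                     # actuator variances
    phi_c = np.array([c @ S_xx @ c for c in c_rows])    # critical-variable variances
    J = np.trace(X @ W + Q @ Y + 2 * X @ A @ Y)         # optimal cost (12.21)
    return np.concatenate([phi_u, phi_c]), J


# Hypothetical data: double integrator, critical variable = position.
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
W, V = np.eye(2), np.array([[1.0]])
c_rows = [np.array([1.0, 0.0])]

tradeoff = []
for rho in (0.1, 1.0, 10.0):          # weight on the actuator variance
    lam = np.array([rho, 1.0])
    phi, J = weighted_sum_design(A, B, C, W, V, c_rows, lam)
    tradeoff.append((rho, phi[0], phi[1], J))
    print(rho, phi, J)
```

Each triple (actuator variance, position variance) printed by the sweep is one point on the tradeoff curve; increasing the actuator weight trades actuator effort for regulation.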

We note that by solving an LQG problem, we can evaluate the dual function $\psi$
described in section 3.6.2:

$$\psi(\lambda) = \min \left\{ \lambda_1 \phi_1(H) + \cdots + \lambda_L \phi_L(H) \mid H \text{ is realizable} \right\} = J^*_{\mathrm{lqg}},$$

with $J^*_{\mathrm{lqg}}$ given by (12.21), using the weights (12.24-12.25). We will use this fact in
section 14.5, where we describe an algorithm for solving the feasibility problem (12.22).

12.3 Minimum Entropy Regulator

The LQG solution method described in section 12.2 was recently modified to find
the controller that minimizes the $\gamma$-entropy of $H$, defined in section 5.3.5. Since
the $\gamma$-entropy of $H$ is finite if and only if its $H_\infty$ norm is less than $\gamma$, this analytic
solution method can be used to solve the feasibility problem with the inequality
specification $\|H\|_\infty < \gamma$.

The plant is identical to the one considered for the LQG problem, given by
(12.7-12.15); we also make the same standard assumptions for the plant that we made
for the LQG case. The design specifications are realizability and the $H_\infty$ norm
inequality specification

$$\|H\|_\infty < \gamma \tag{12.26}$$

(which are stronger than internal stability under the standard assumptions). We
will show how to solve the feasibility problem for this one-dimensional family of
design specifications.

It turns out that if the design specification (12.26) (along with realizability) is
achievable, then it is achievable by a controller that is, except for a scale factor,
an estimated-state-feedback controller. This controller can be found as follows. If
$\gamma$ is such that the specification (12.26) is feasible, then the two algebraic Riccati
equations

$$A^T X_{\mathrm{me}} + X_{\mathrm{me}} A - X_{\mathrm{me}} \left( B R^{-1} B^T - \gamma^{-2} W \right) X_{\mathrm{me}} + Q = 0 \tag{12.27}$$

(c.f. (12.17)), and

$$A Y_{\mathrm{me}} + Y_{\mathrm{me}} A^T - Y_{\mathrm{me}} \left( C^T V^{-1} C - \gamma^{-2} Q \right) Y_{\mathrm{me}} + W = 0 \tag{12.28}$$

(c.f. (12.19)) have unique positive definite solutions $X_{\mathrm{me}}$ and $Y_{\mathrm{me}}$, respectively. (The
mnemonic subscript "me" stands for "minimum entropy".) These solutions can be
found by the method described in section 12.1, using the associated Hamiltonian
matrices

$$\begin{bmatrix} A & -(B R^{-1} B^T - \gamma^{-2} W) \\ -Q & -A^T \end{bmatrix}, \qquad \begin{bmatrix} A^T & -(C^T V^{-1} C - \gamma^{-2} Q) \\ -W & -A \end{bmatrix};$$

if either of these matrices has imaginary eigenvalues, then the corresponding ARE
does not have a positive definite solution, and the specification (12.26) is not feasible.
From $X_{\mathrm{me}}$ and $Y_{\mathrm{me}}$ we form the matrix

$$X_{\mathrm{me}} \left( I - \gamma^{-2} Y_{\mathrm{me}} X_{\mathrm{me}} \right)^{-1}, \tag{12.29}$$

which can be shown to be symmetric. If this matrix is not positive definite (or the
inverse fails to exist), then the specification (12.26) (along with realizability) is not
feasible.

If, on the other hand, the positive definite solutions $X_{\mathrm{me}}$ and $Y_{\mathrm{me}}$ exist, and the
matrix (12.29) exists and is positive definite, then the specification (12.26) (along
with realizability) is feasible. Let

$$K_{\mathrm{sfb}} = R^{-1} B^T X_{\mathrm{me}} \left( I - \gamma^{-2} Y_{\mathrm{me}} X_{\mathrm{me}} \right)^{-1} \tag{12.30}$$

and

$$L_{\mathrm{est}} = Y_{\mathrm{me}} C^T V^{-1} \tag{12.31}$$

(c.f. (12.16) and (12.18)). A controller that achieves the design specifications is
given by

$$K_{\mathrm{me}}(s) = -K_{\mathrm{sfb}} \left( sI - A + B K_{\mathrm{sfb}} + L_{\mathrm{est}} C - \gamma^{-2} Y_{\mathrm{me}} Q \right)^{-1} L_{\mathrm{est}}$$

(c.f. the LQG-optimal controller (12.20)).
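The feasibility test just described can be sketched in a few lines. This is a minimal sketch under stated assumptions: the Riccati solutions are computed from the stable eigenvectors of the Hamiltonian matrices (one common way to use a Hamiltonian; we assume it is in the spirit of the method of section 12.1), the function names are hypothetical, and no special care is taken for ill-conditioned cases. The scalar plant at the bottom is hypothetical as well.

```python
import numpy as np


def stabilizing_are_solution(A, S, Q, tol=1e-9):
    """Stabilizing solution of A^T X + X A - X S X + Q = 0 via the Hamiltonian.

    Returns None if the Hamiltonian has eigenvalues on the imaginary axis.
    """
    n = A.shape[0]
    ham = np.block([[A, -S], [-Q, -A.T]])
    eigvals, eigvecs = np.linalg.eig(ham)
    if np.any(np.abs(eigvals.real) < tol):
        return None
    basis = eigvecs[:, eigvals.real < 0]       # stable invariant subspace
    X1, X2 = basis[:n, :], basis[n:, :]
    X = np.real(X2 @ np.linalg.inv(X1))
    return 0.5 * (X + X.T)


def is_pos_def(M):
    return bool(np.all(np.linalg.eigvalsh(0.5 * (M + M.T)) > 0))


def hinf_feasible(gamma, A, B, C, Q, R, W, V):
    """Test (12.26) using (12.27), (12.28), and the matrix (12.29)."""
    g2 = gamma ** -2
    X = stabilizing_are_solution(A, B @ np.linalg.solve(R, B.T) - g2 * W, Q)
    Y = stabilizing_are_solution(A.T, C.T @ np.linalg.solve(V, C) - g2 * Q, W)
    if X is None or Y is None or not (is_pos_def(X) and is_pos_def(Y)):
        return False
    M = np.eye(A.shape[0]) - g2 * Y @ X
    if abs(np.linalg.det(M)) < 1e-12:
        return False
    return is_pos_def(X @ np.linalg.inv(M))    # matrix (12.29)


# Hypothetical scalar example: A = -1, all other data equal to one.
one = np.array([[1.0]])
data = (-one, one, one, one, one, one, one)    # A, B, C, Q, R, W, V

lo, hi = 0.1, 10.0            # lo is infeasible, hi is feasible here
for _ in range(60):           # bisection on gamma
    mid = 0.5 * (lo + hi)
    if hinf_feasible(mid, *data):
        hi = mid
    else:
        lo = mid
print(hi)
```

For this scalar example the two Riccati equations can be solved by hand, and the bisection converges to about 0.732, i.e., $\sqrt{3} - 1$, the value of $\gamma$ at which the matrix (12.29) ceases to be positive definite.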

12.4 A Simple Rise Time, Undershoot Example 

In this section and the next we show how to find explicit solutions for two specific
plants and families of design specifications.

We consider the classical 1-DOF system of section 2.3.2 with

$$P_0(s) = \frac{s - 1}{(s + 1)^2}.$$

It is well-known in classical control that since $P_0$ has a real unstable zero at $s = 1$,
the step response from the reference input $r$ to the system output $y_p$, $s_{13}(t)$, must
exhibit some undershoot. We will study exactly how much it must undershoot, when
we require that a stabilizing controller also meet a minimum rise-time specification.
Our design specifications are internal stability, a limit on undershoot,

$$\phi_{\mathrm{us}}(H_{13}) \leq U_{\max}, \tag{12.32}$$

and a limit on rise time,

$$\phi_{\mathrm{rise}}(H_{13}) \leq T_{\max}. \tag{12.33}$$

Thus we have a two-parameter family of design specifications, indexed by $U_{\max}$ and
$T_{\max}$.

These design specifications are simple enough that we can readily solve the
feasibility problem for each $U_{\max}$ and $T_{\max}$. We will see, however, that these design
specifications are not complete enough to guarantee reasonable controller designs;
for example, they include no limit on actuator effort. We will return to this point
later.

We can express the design specification of internal stability in terms of the
interpolation conditions (section 7.2.5) for $T$, the I/O transfer function: $T$ is stable
and satisfies

$$T(1) = T(\infty) = 0. \tag{12.34}$$

This in turn can be expressed in terms of the step response $s_{13}(t)$: $s_{13}$ is the step
response of a stable transfer function and satisfies

$$\int_0^\infty s_{13}(t) e^{-t} \, dt = 0, \tag{12.35}$$

$$s_{13}(0) = 0. \tag{12.36}$$

Now if (12.33) holds then

$$\int_{T_{\max}}^\infty s_{13}(t) e^{-t} \, dt \geq 0.8 \int_{T_{\max}}^\infty e^{-t} \, dt = 0.8 e^{-T_{\max}}. \tag{12.37}$$

If (12.32) holds then

$$\int_0^{T_{\max}} s_{13}(t) e^{-t} \, dt \geq -U_{\max} \int_0^{T_{\max}} e^{-t} \, dt = -U_{\max} \left( 1 - e^{-T_{\max}} \right). \tag{12.38}$$

Adding (12.37) and (12.38) we have

$$0 = \int_0^\infty s_{13}(t) e^{-t} \, dt \geq 0.8 e^{-T_{\max}} - U_{\max} \left( 1 - e^{-T_{\max}} \right).$$

Hence if the design specifications with $U_{\max}$ and $T_{\max}$ are feasible,

$$U_{\max} \geq \frac{0.8 e^{-T_{\max}}}{1 - e^{-T_{\max}}}.$$

This relation is shown in figure 12.3. We have shown that every achievable
undershoot, rise-time specification must lie in the shaded region of figure 12.3; in other
words, the shaded region in figure 12.3 includes the region of achievable
specifications in performance space.

In fact, the specifications with limits $U_{\max}$ and $T_{\max}$ are achievable if and only
if

$$U_{\max} > \frac{0.8 e^{-T_{\max}}}{1 - e^{-T_{\max}}}, \tag{12.39}$$

so that the shaded region in figure 12.3 is exactly the region of achievable
specifications for our family of design specifications.
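The bound (12.39) is easy to evaluate. The following is a minimal sketch; the function names `min_undershoot` and `is_achievable` are hypothetical, and the 0.8 level is the rise-time threshold appearing in (12.37).

```python
import math


def min_undershoot(T_max, level=0.8):
    """Right-hand side of (12.39): the infimum of achievable undershoot."""
    return level * math.exp(-T_max) / (1.0 - math.exp(-T_max))


def is_achievable(U_max, T_max):
    """Achievability test (12.39) for the undershoot, rise-time specification."""
    return U_max > min_undershoot(T_max)


for T_max in (0.5, 1.0, 1.5):
    print(T_max, min_undershoot(T_max))

print(is_achievable(0.70, 1.0))   # True
print(is_achievable(0.40, 1.0))   # False
```

For a rise-time limit of 1.0 the bound evaluates to about 0.466, so a specification with an undershoot limit of 0.70 is achievable while one with a limit of 0.4 is not.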

We will briefly explain why this is true. Suppose that $U_{\max}$ and $T_{\max}$
satisfy (12.39). We can then find a step response $s_{13}(t)$ of a stable rational transfer
function, that satisfies the interpolation conditions (12.35-12.36) and the rise-time
and undershoot limits. If $U_{\max}$ and $T_{\max}$ are near the boundary of the region of
achievable specifications, this step response will have to "hug" (but not violate)
the two constraints. For $U_{\max} = 0.70$ and $T_{\max} = 1.0$ (marked "X" in figure 12.3)
a suitable step response is shown in figure 12.4; it is the step response of a 20th
order transfer function (and corresponds to a controller $K$ of order 22). (A detailed
justification that we can always design such a step response is quite cumbersome;
we have tried to give the general idea. See the Notes and References at the end of
this chapter for more detail about this particular transfer function.)

[Figure 12.3: plot of the region of achievable specifications; horizontal axis $T_{\max}$, from 0.2 to 1.8.]

Figure 12.3 The tradeoff between achievable undershoot and rise-time
specifications.

The rapid changes near $t = 0$ and $t = 1$ of the step response shown in figure 12.4
suggest very large actuator signals, and this can be verified. It should be clear that
for specifications $U_{\max}$, $T_{\max}$ that are nearly Pareto optimal, such rapid changes in
the step response, and hence large actuator signals, will be necessary. So controllers
that achieve specifications near the tradeoff curve are probably not reasonable from
a practical point of view; but we point out that this "side information" that the
actuator signal should be limited was not included in our design specifications. The
fact that our specifications do not limit actuator effort, and therefore are probably
not sensible, is reflected in the fact that the Pareto optimal specifications, which
satisfy

$$U_{\max} = \frac{0.8 e^{-T_{\max}}}{1 - e^{-T_{\max}}},$$

are not achievable (see the comments at the end of section 3.5).

Figure 12.4 A step response with an undershoot of 0.70 and a rise time of
1.0, which achieves the specifications marked "X" in figure 12.3. Undershoots
as small as 0.466 with a rise time of 1.0 are also achievable.



The tradeoff curve in figure 12.3 is valuable even though the design specifications
do not limit actuator effort. If we add to our design specifications an appropriate
limit on actuator effort, the new tradeoff curve will lie above the one we have found.
Thus, our tradeoff curve identifies design specifications that are not achievable, e.g.,
$U_{\max} = 0.4$, $T_{\max} = 1.0$, when no limit on actuator effort is made; a fortiori these
design specifications are not achievable when a limit on actuator effort is included.

We remark that the tradeoff for this example is considerably more general than
the reader might suspect. (12.39) holds for

• any LTI plant with $P_0(1) = 0$,

• the 2-DOF controller configuration,

• any nonlinear or time-varying controller.

This is because, no matter how the plant input $u$ is generated, the output of $P_0$,
$y_p$, must satisfy conditions of the form (12.35-12.36).



12.5 A Weighted Peak Tracking Error Example 

In this section we present a less trivial example of a plant and family of design
specifications for which we can explicitly solve the feasibility problem.

We consider the classical 1-DOF system of section 2.3.2 with

$$P_0(s) = \frac{s - 2}{s^2 - 1}.$$

Designing a controller for this plant is quite demanding, since it has an unstable
zero at $s = 2$ along with an unstable pole only an octave lower, at $s = 1$.

Our design specifications will be internal stability and a limit on a weighted
peak gain of the closed-loop tracking error transfer function:

$$\|WS\|_{\mathrm{pk\_gn}} \leq E_{\max}, \tag{12.40}$$

where

$$W(s) = \frac{1}{1 + s T_{\mathrm{trk}}},$$

and $-S$ is the closed-loop transfer function from the reference input $r$ to the error
$e = -r + y_p$ (see sections 5.2.5 and 8.1.2). Thus we have a two-parameter family of
design specifications, indexed by $E_{\max}$ and $T_{\mathrm{trk}}$.

Roughly speaking, $E_{\max}$ is an approximate limit on the worst case peak
mistracking that can occur with reference inputs that are bounded by one and have a
bandwidth $1/T_{\mathrm{trk}}$. Therefore, $1/T_{\mathrm{trk}}$ represents a sort of tracking bandwidth for the
system. It seems intuitively clear, and turns out to be correct, that small $E_{\max}$ can
only be achieved at the cost of large $T_{\mathrm{trk}}$.

These design specifications are simple enough that we can explicitly solve the
feasibility problem for each $E_{\max}$ and $T_{\mathrm{trk}}$. As in the previous section, however, these
design specifications are not complete enough to guarantee reasonable controller
designs, so the comments made in the previous section hold here as well.

As we did for the previous example, we express internal stability in terms of the
interpolation conditions: $S$ is stable and satisfies

$$S(1) = 0, \qquad S(2) = S(\infty) = 1.$$

Equivalently, $WS$ is stable, and satisfies

$$WS(1) = 0, \tag{12.41}$$

$$WS(2) = (1 + 2 T_{\mathrm{trk}})^{-1}, \tag{12.42}$$

$$\lim_{s \to \infty} s \, WS(s) = T_{\mathrm{trk}}^{-1}. \tag{12.43}$$

Let $h$ be the impulse response of $WS$, so that

$$\|WS\|_{\mathrm{pk\_gn}} = \int_0^\infty |h(t)| \, dt$$

(from (12.43), $h$ does not contain any impulse at $t = 0$). We can express the
interpolation conditions in terms of $h$ as

$$\int_0^\infty h(t) e^{-t} \, dt = 0, \tag{12.44}$$

$$\int_0^\infty h(t) e^{-2t} \, dt = (1 + 2 T_{\mathrm{trk}})^{-1}, \tag{12.45}$$

$$h(0) = T_{\mathrm{trk}}^{-1}. \tag{12.46}$$

We will solve the feasibility problem by solving the optimization problem

$$\min_{\text{subject to (12.44-12.46)}} \ \int_0^\infty |h(t)| \, dt. \tag{12.47}$$

In chapters 13-15 we will describe general numerical methods for solving an
infinite-dimensional convex optimization problem such as (12.47); here we will use some
specific features to analytically determine the solution. We will first guess a solution,
based on some informal reasoning, and then prove, using only simple arguments,
that our guess is correct.

We first note that the third constraint, on $h(0)$, should not affect the minimum,
since we can always adjust $h$ very near $t = 0$ to satisfy this constraint, without
affecting the other two constraints, and only slightly changing the objective. It
can be shown that the value of the minimum does not change if we ignore this
constraint, so henceforth we will.

Now we consider the two integral constraints. From the second, we see that $h(t)$
will need to be positive over some time interval, and from the first we see that $h(t)$
will also have to be negative over some other time interval. Since the integrand
$e^{-2t}$ falls off more rapidly than $e^{-t}$, it seems that the optimal $h(t)$ should first be
positive, and later negative, to take advantage of these differing decay rates. Similar
reasoning finally leads us to guess that a nearly optimal $h$ should satisfy

$$h(t) \approx \alpha \delta(t) - \beta \delta(t - T), \tag{12.48}$$

where $\alpha$ and $\beta$ are positive, and $T$ is some appropriate time lapse. The objective is
then approximately $\alpha + \beta$.

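Given this form, we readily determine that the optimal $\alpha$, $\beta$, and $T$ are given
by

$$\alpha = \frac{1}{1 + 2 T_{\mathrm{trk}}} \left( 1 + \frac{1}{\sqrt{2}} \right), \tag{12.49}$$

$$\beta = \frac{1}{1 + 2 T_{\mathrm{trk}}} \left( 2 + \frac{3}{\sqrt{2}} \right), \tag{12.50}$$

$$T = \log(1 + \sqrt{2}), \tag{12.51}$$

which corresponds to an objective of

$$\frac{1}{1 + 2 T_{\mathrm{trk}}} \left( 3 + 2\sqrt{2} \right). \tag{12.52}$$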
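The values (12.49-12.52) can be reproduced numerically. For a fixed time lapse $T$, the two integral constraints (12.44-12.45) applied to the two-impulse form (12.48) give $\alpha - \beta e^{-T} = 0$ and $\alpha - \beta e^{-2T} = (1 + 2T_{\mathrm{trk}})^{-1}$, which determine $\alpha$ and $\beta$; it remains to minimize $\alpha + \beta$ over $T$. The sketch below does this with SciPy's bounded scalar minimizer; the function name `two_impulse_design` and the value $T_{\mathrm{trk}} = 1$ are hypothetical choices for illustration.

```python
import math
from scipy.optimize import minimize_scalar


def two_impulse_design(T, T_trk):
    """alpha, beta for h = alpha*delta(t) - beta*delta(t - T) meeting (12.44)-(12.45)."""
    c = 1.0 / (1.0 + 2.0 * T_trk)
    alpha = c / (1.0 - math.exp(-T))   # from alpha - beta*exp(-2T) = c
    beta = alpha * math.exp(T)         # from alpha - beta*exp(-T) = 0
    return alpha, beta


T_trk = 1.0   # illustrative value
res = minimize_scalar(lambda T: sum(two_impulse_design(T, T_trk)),
                      bounds=(1e-3, 10.0), method="bounded")

alpha, beta = two_impulse_design(res.x, T_trk)
print(res.x, math.log(1.0 + math.sqrt(2.0)))                      # optimal T vs (12.51)
print(res.fun, (3.0 + 2.0 * math.sqrt(2.0)) / (1.0 + 2.0 * T_trk))  # objective vs (12.52)
```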



Our guess that the value of (12.47) is given by (12.52) is correct. To verify this,
we consider $\lambda : \mathbf{R}_+ \to \mathbf{R}$ given by

$$\lambda(t) = -(2 + 2\sqrt{2}) e^{-t} + (3 + 2\sqrt{2}) e^{-2t}, \tag{12.53}$$

and plotted in figure 12.5. This function has a maximum magnitude of one, i.e.,
$\|\lambda\|_\infty = 1$.

Figure 12.5 The function $\lambda(t)$ from (12.53).

Now suppose that $h$ satisfies the two integral equality constraints in (12.47).
Then by linearity we must have

$$\int_0^\infty h(t) \lambda(t) \, dt = \frac{1}{1 + 2 T_{\mathrm{trk}}} \left( 3 + 2\sqrt{2} \right). \tag{12.54}$$

Since $|\lambda(t)| \leq 1$ for all $t$, we have

$$\int_0^\infty h(t) \lambda(t) \, dt \leq \int_0^\infty |h(t)| \, dt = \|h\|_1. \tag{12.55}$$

Combining (12.54) and (12.55), we see that for any $h$,

$$\int_0^\infty h(t) e^{-t} \, dt = 0 \ \text{ and } \ \int_0^\infty h(t) e^{-2t} \, dt = (1 + 2 T_{\mathrm{trk}})^{-1} \ \Longrightarrow \ \|h\|_1 \geq \frac{1}{1 + 2 T_{\mathrm{trk}}} \left( 3 + 2\sqrt{2} \right),$$

i.e., any $h$ that satisfies the constraints in (12.47) has an objective that is at least as
large as the objective of our candidate solution (12.52). This proves that our guess is correct.
(The origin of this mysterious $\lambda$ is explained in the Notes and References.)
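The two facts used in this argument can be checked numerically. The sketch below, under stated assumptions, evaluates $\lambda(t)$ from (12.53) on a fine grid to confirm that its magnitude never exceeds one, and then tests (12.54) and (12.55) on one trial $h$. The trial function $h(t) = p e^{-3t} + q e^{-5t}$, the helper `integrate`, and the value $T_{\mathrm{trk}} = 1$ are hypothetical choices; $p$ and $q$ are fixed by the two integral constraints.

```python
import numpy as np

SQ2 = np.sqrt(2.0)


def lam(t):
    """The function lambda(t) of (12.53)."""
    return -(2 + 2 * SQ2) * np.exp(-t) + (3 + 2 * SQ2) * np.exp(-2 * t)


def integrate(y, t):
    """Trapezoidal rule on the grid t."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(t)))


t = np.linspace(0.0, 20.0, 200001)
T_trk = 1.0                                   # illustrative value
c = 1.0 / (1.0 + 2.0 * T_trk)
bound = (3 + 2 * SQ2) * c                     # the objective (12.52)

# Trial h(t) = p e^{-3t} + q e^{-5t}; the integrals of e^{-at} e^{-st}
# over [0, inf) equal 1/(a + s), so (12.44)-(12.45) are linear in (p, q).
M = np.array([[1 / 4, 1 / 6],
              [1 / 5, 1 / 7]])
p, q = np.linalg.solve(M, [0.0, c])
h = p * np.exp(-3 * t) + q * np.exp(-5 * t)

print(np.max(np.abs(lam(t))))                 # maximum magnitude of lambda
print(t[np.argmin(lam(t))], np.log(1 + SQ2))  # where lambda reaches -1
print(integrate(h * lam(t), t), bound)        # (12.54)
print(integrate(np.abs(h), t), bound)         # (12.55): norm of h versus the bound
```

The trial $h$ is feasible but far from optimal, so its norm is noticeably larger than the bound; the bound is approached only by functions close to the two-impulse form (12.48).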

From our solution (12.52) of the optimization problem (12.47), we conclude that
the specifications corresponding to $E_{\max}$ and $T_{\mathrm{trk}}$ are achievable if and only if

$$E_{\max} (1 + 2 T_{\mathrm{trk}}) \geq 3 + 2\sqrt{2}. \tag{12.56}$$

(We leave the construction of a controller that meets the specifications for $E_{\max}$
and $T_{\mathrm{trk}}$ satisfying (12.56) to the reader.) This region of achievable specifications
is shown in figure 12.6.
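The condition (12.56) can be turned into a small calculator. This is a minimal sketch; the function names `is_achievable` and `min_T_trk` are hypothetical, and the inequality is taken as written in (12.56).

```python
import math

BOUND = 3.0 + 2.0 * math.sqrt(2.0)   # right-hand side of (12.56)


def is_achievable(E_max, T_trk):
    """Achievability test (12.56) for the weighted peak tracking error."""
    return E_max * (1.0 + 2.0 * T_trk) >= BOUND


def min_T_trk(E_max):
    """Smallest smoothing time constant compatible with (12.56)."""
    return max(0.0, 0.5 * (BOUND / E_max - 1.0))


for E_max in (0.1, 0.5, 1.0):
    print(E_max, min_T_trk(E_max))
```

The printed pairs trace out the boundary of the region of achievable specifications: halving the allowed peak tracking error roughly doubles the required time constant.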



[Figure 12.6: plot of the region of achievable specifications; horizontal axis $T_{\mathrm{trk}}$, from 0 to 5.]

Figure 12.6 The tradeoff between peak tracking error and tracking bandwidth
specifications.



Note that to guarantee that the worst case peak tracking error does not exceed
10%, the weighting filter smoothing time constant must satisfy $T_{\mathrm{trk}} \geq 28.64$,
which is much greater than the time constants in the dynamics of $P_0$, which are on
the order of one second. In classical terminology, the tracking bandwidth is
considerably smaller than the open-loop bandwidth. The necessarily poor performance
implied by the tradeoff curve (12.56) is a quantitative expression that this plant is
"hard to control".

Notes and References

LQR and LQG-Optimal Controllers 

Standard references on LQR and LQG-optimal controllers are the books by Anderson
and Moore [AM90], Kwakernaak and Sivan [KS72], Bryson and Ho [BH75], and the
special issue edited by Athans [Ath71]. Åström and Wittenmark treat minimum variance
regulators in [AW90]. The same techniques are readily extended to solve problems that
involve an exponentially weighted $H_2$ norm; see, e.g., Anderson and Moore [AM69].

Multicriterion LQG

The articles by Toivonen [Toi84] and Toivonen and Mäkilä [TM89] discuss the
multicriterion LQG problem; the latter article has extensive references to other articles on this
topic. See also Koussoulas and Leondes [KL86].

Controllers that Satisfy an $H_\infty$ Norm-Bound

In [Zam81], Zames proposed that the $H_\infty$ norm of some appropriate closed-loop
transfer matrix be minimized, although control design specifications that limit the magnitude
of closed-loop transfer functions appeared much earlier. The state-space solution of
section 12.3 is recent, and is due to Doyle, Glover, Khargonekar, and Francis [DGK89, GD88].
Previous solutions to the feasibility problem with an $H_\infty$ norm-bound on $H$ were
considerably more complex.

We noted above that the controller $K_{\mathrm{me}}$ of section 12.3 not only satisfies the
specification (12.26); it minimizes the $\gamma$-entropy of $H$. This is discussed in Mustafa and
Glover [Mus89, MG90, GM89]. The minimum entropy controller was developed
independently by Whittle [Whi90], who calls it the linear exponential quadratic Gaussian
(LEQG) optimal controller.

Some Other Analytic Solutions 

In [OF85, OF86], O'Young and Francis use Nevanlinna-Pick theory to deduce exact trade- 
off curves that limit the maximum magnitude of the sensitivity transfer function in two 
different frequency bands. 

Some analytic solutions to discrete-time problems involving the peak gain have been found 
by Dahleh and Pearson; see [Vid86, DP87b, DP88b, DP87a, DP88a]. 

About Figure 12.4 

The step response shown in figure 12.4 was found as follows. We let

$$T(s) = \sum_{i=1}^{20} x_i \left( \frac{10}{s + 10} \right)^i, \tag{12.57}$$

where $x \in \mathbf{R}^{20}$ is to be determined. (See chapter 15 for an explanation of this Ritz
approximation.) $T(s)$ must satisfy the condition (12.34). The constraint $T(\infty) = 0$ is
automatically satisfied; the interpolation condition $T(1) = 0$ yields the equality constraint
on $x$,

$$c^T x = 0, \tag{12.58}$$

where $c_i = (10/11)^i$. The undershoot and rise-time specifications are

$$\sum_{i=1}^{20} x_i s_i(t) \geq -0.7 \quad \text{for } 0 \leq t \leq 1.0, \tag{12.59}$$

$$\sum_{i=1}^{20} x_i s_i(t) \geq 0.8 \quad \text{for } t \geq 1.0, \tag{12.60}$$

where $s_i$ is the step response of $(s/10 + 1)^{-i}$. By finely discretizing $t$, (12.59) and (12.60)
yield (many) linear inequality constraints on $x$, i.e.,

$$a_k^T x \leq b_k, \quad k = 1, \ldots, L. \tag{12.61}$$

(12.58) and (12.61) can be solved as a feasibility linear program. The particular coefficients
that we used, shown in table 12.1, were found by minimizing $\|x\|_\infty$ subject to (12.58)
and (12.61).

  i     x_i        i     x_i        i     x_i        i     x_i
  1   -3.027       6   -3.677      11    4.479      16    9.641
  2    7.227       7   -9.641      12   -9.641      17   -1.660
  3   -9.374       8   -3.018      13   -9.641      18   -9.641
  4    1.836       9    9.641      14   -5.682      19    4.398
  5    9.641      10    9.641      15    9.641      20   -0.343

Table 12.1 The coefficients in the parametrization (12.57) for the step
response in figure 12.4.
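The linear program described above can be sketched with SciPy. This is a minimal sketch under stated assumptions, not a reproduction of the computation behind table 12.1: the time grid (0 to 6 in steps of 0.01) is our own choice, and the step response of $(s/10 + 1)^{-i}$ is evaluated as the regularized lower incomplete gamma function $P(i, 10t)$, which is the step response of an $i$-fold repeated first-order lag. The extra variable `tau` bounds $\|x\|_\infty$, so minimizing it minimizes the largest coefficient magnitude.

```python
import numpy as np
from scipy.optimize import linprog
from scipy.special import gammainc

N = 20                                     # number of terms in (12.57)
t = np.linspace(0.0, 6.0, 601)             # assumed discretization of time
orders = np.arange(1, N + 1)

# S[k, i-1] = s_i(t_k): step response of (s/10 + 1)^{-i} at time t_k.
S = gammainc(orders[None, :], 10.0 * t[:, None])
lower = np.where(t < 1.0, -0.7, 0.8)       # (12.59) before t = 1, (12.60) after
c = (10.0 / 11.0) ** orders                # coefficients of (12.58)

# Variables: (x_1, ..., x_N, tau); minimize tau with |x_i| <= tau.
ones = np.ones((N, 1))
A_ub = np.vstack([
    np.hstack([-S, np.zeros((len(t), 1))]),    # -S x <= -lower
    np.hstack([np.eye(N), -ones]),             #  x_i - tau <= 0
    np.hstack([-np.eye(N), -ones]),            # -x_i - tau <= 0
])
b_ub = np.concatenate([-lower, np.zeros(2 * N)])
A_eq = np.append(c, 0.0)[None, :]              # c^T x = 0
cost = np.append(np.zeros(N), 1.0)

res = linprog(cost, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=[0.0],
              bounds=[(None, None)] * N + [(0, None)], method="highs")
x = res.x[:N]
print(res.status, res.x[-1])               # status 0 means an optimum was found
```

The resulting coefficients depend on the discretization, so they need not match table 12.1 digit for digit.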

About the Examples in Sections 12.4 and 12.5 

These two examples can be expressed as infinite-dimensional linear programming problems.
The references for the next two chapters are relevant; see also Luenberger [Lue69],
Rockafellar [Roc74, Roc82], Reiland [Rei80], Anderson and Philpott [AP84], and Anderson
and Nash [AN87].

We solved the problem (12.47) (ignoring the third equality constraint) by first solving its
dual problem, which is

$$\max_{\|\lambda_1 e^{-t} + \lambda_2 e^{-2t}\|_\infty \leq 1} \ \lambda_2 (1 + 2 T_{\mathrm{trk}})^{-1}. \tag{12.62}$$

This is a convex optimization problem in $\mathbf{R}^2$, which is readily solved. The mysterious $\lambda(t)$
that we used corresponds exactly to the optimum $\lambda_1$ and $\lambda_2$ for this dual problem.

This dual problem is sometimes called a semi-infinite optimization problem since the
constraint involves a "continuum" of inequalities (i.e., $|\lambda_1 e^{-t} + \lambda_2 e^{-2t}| \leq 1$ for each $t \geq 0$).
Special algorithms have been developed for these problems; see for example the surveys
by Polak [Pol83], Polak, Mayne, and Stimler [PMS84], and Hettich [Het78].
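A crude way to solve the dual problem (12.62) is to replace the continuum of inequalities by a fine grid in $t$, which gives an ordinary linear program in the two variables $\lambda_1$ and $\lambda_2$. The sketch below is an illustration only, not one of the special algorithms cited above; the grid and the value $T_{\mathrm{trk}} = 1$ are hypothetical choices.

```python
import numpy as np
from scipy.optimize import linprog

T_trk = 1.0                                    # illustrative value
c = 1.0 / (1.0 + 2.0 * T_trk)

t = np.linspace(0.0, 10.0, 10001)              # grid replacing "each t >= 0"
E = np.column_stack([np.exp(-t), np.exp(-2.0 * t)])

# |lambda_1 e^{-t} + lambda_2 e^{-2t}| <= 1 at every grid point.
A_ub = np.vstack([E, -E])
b_ub = np.ones(2 * len(t))

# linprog minimizes, so we minimize -lambda_2 * c.
res = linprog([0.0, -c], A_ub=A_ub, b_ub=b_ub,
              bounds=[(None, None)] * 2, method="highs")
lam1, lam2 = res.x
print(lam1, lam2)                              # compare with (12.53)
print(-res.fun, (3.0 + 2.0 * np.sqrt(2.0)) * c)  # compare with (12.52)
```

Because the grid enforces only a subset of the inequalities, the computed optimum can slightly exceed the true one; with a fine grid the difference is negligible.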



Chapter 13 

Elements of Convex Analysis 



We describe some of the basic tools of convex nondifferentiable analysis: sub- 
gradients, directional derivatives, and supporting hyperplanes, emphasizing their 
geometric interpretations. We show how to compute supporting hyperplanes and 
subgradients for the various specifications and functionals described in previous 
chapters. 

Many of the specifications and functionals that we have encountered in chapters 8- 
10 are not smooth — the specifications can have "sharp corners" and the functionals 
need not be differentiable. Fortunately, for convex sets and functionals, some of 
the most important analytical tools do not depend on smoothness. In this chapter 
we study these tools. Perhaps more importantly, there are simple and effective 
algorithms for convex optimization that do not require smooth constraints or dif- 
ferentiable objectives. We will study some of these algorithms in the next chapter. 

13.1 Subgradients 

If $\phi : \mathbf{R}^n \to \mathbf{R}$ is convex and differentiable, we have

$$\phi(z) \geq \phi(x) + \nabla \phi(x)^T (z - x) \quad \text{for all } z. \tag{13.1}$$

This means that the plane tangent to the graph of $\phi$ at $x$ always lies below the
graph of $\phi$. If $\phi : \mathbf{R}^n \to \mathbf{R}$ is convex, but not necessarily differentiable, we will say
that $g \in \mathbf{R}^n$ is a subgradient of $\phi$ at $x$ if

$$\phi(z) \geq \phi(x) + g^T (z - x) \quad \text{for all } z. \tag{13.2}$$

From (13.1), the gradient of a differentiable convex function is always a subgradient.
A basic result of convex analysis is that every convex function always has at least
one subgradient at every point. We will denote the set of all subgradients of $\phi$ at $x$
as $\partial \phi(x)$, the subdifferential of $\phi$ at $x$.

We can think of the right-hand side of (13.2) as an affine approximation to $\phi(z)$,
which is exact at $z = x$. The inequality (13.2) states that the right-hand side is a
global lower bound on $\phi$. This is shown in figure 13.1.

Figure 13.1 A convex function on $\mathbf{R}$ along with three affine global lower
bounds on $\phi$ derived from subgradients. At $x_1$, $\phi$ is differentiable, and the
slope of the tangent line is $g_1 = \phi'(x_1)$. At $x_2$, $\phi$ is not differentiable; two
different tangent lines, corresponding to subgradients $g_2$ and $\tilde{g}_2$, are shown.

We mention two important consequences of $g \in \partial \phi(x)$. For $g^T (z - x) > 0$ we
have $\phi(z) > \phi(x)$, in other words, in the half-space $\{ z \mid g^T (z - x) > 0 \}$, the values
of $\phi$ exceed the value of $\phi$ at $x$. Thus if we are searching for an $x^*$ that minimizes
$\phi$, and we know a subgradient $g$ of $\phi$ at $x$, then we can rule out the entire half-space
$g^T (z - x) > 0$. The hyperplane $g^T (z - x) = 0$ is called a cut because it cuts off
from consideration the half-space $g^T (z - x) > 0$ in a search for a minimizer. This
is shown in figure 13.2.

An extension of this idea will also be useful. From (13.2), every $z$ that satisfies
$\phi(z) \leq \alpha$, where $\alpha < \phi(x)$, must also satisfy $g^T (z - x) \leq \alpha - \phi(x)$. If we are searching
for a $z$ that satisfies $\phi(z) \leq \alpha$, we need not consider the half-space $g^T (z - x) >
\alpha - \phi(x)$. The hyperplane $g^T (z - x) = \alpha - \phi(x)$ is called a deep-cut because it rules
out a larger set than the simple cut $g^T (z - x) = 0$. This is shown in figure 13.3.
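A small numerical experiment may help fix these ideas. The sketch below uses a piecewise-linear convex function $\phi(z) = \max_i (a_i^T z + b_i)$, for which the coefficient vector $a_i$ of any active piece is a subgradient; the function, the random data, and the names `phi` and `subgradient` are our own illustrative choices. It checks the lower bound (13.2) and the deep-cut property on random points.

```python
import numpy as np


def phi(z, A, b):
    """Piecewise-linear convex function max_i (a_i^T z + b_i)."""
    return float(np.max(A @ z + b))


def subgradient(x, A, b):
    """One subgradient of phi at x: the row a_i of an active piece."""
    return A[int(np.argmax(A @ x + b))]


rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3))
b = rng.standard_normal(5)
x = rng.standard_normal(3)
g = subgradient(x, A, b)

Z = 3.0 * rng.standard_normal((1000, 3))          # random test points
vals = np.array([phi(z, A, b) for z in Z])

# (13.2): the affine function phi(x) + g^T (z - x) is a global lower bound.
lower_bound_ok = bool(np.all(vals >= phi(x, A, b) + (Z - x) @ g - 1e-12))

# Deep-cut: points with phi(z) <= alpha satisfy g^T (z - x) <= alpha - phi(x).
alpha = phi(x, A, b) - 0.5
good = Z[vals <= alpha]
deep_cut_ok = bool(np.all((good - x) @ g <= alpha - phi(x, A, b) + 1e-12))

print(lower_bound_ok, deep_cut_ok)
```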



13.1.1 Subgradients: Infinite-Dimensional Case 

The notion of a subgradient can be generalized to apply to functionals on
infinite-dimensional spaces. The books cited in the Notes and References at the end of this
chapter contain a detailed and precise treatment of this topic; in this book we will
give a simple (but correct) description.

Figure 13.2 A point $x$ and a subgradient $g$ of $\phi$ at $x$. In the half-space
$g^T z > g^T x$, $\phi(z)$ exceeds $\phi(x)$; in particular, any minimizer $x^*$ of $\phi$ must lie
in the half-space $g^T z \leq g^T x$.

Figure 13.3 A point $x$ and a subgradient $g$ of $\phi$ at $x$ determine a deep-cut
in the search for points that satisfy $\phi(z) \leq \alpha$ (assuming $x$ does not satisfy
this inequality). The points in the shaded region need not be considered
since they all have $\phi(z) > \alpha$.

If $\phi$ is a convex functional on a (possibly infinite-dimensional) vector space $V$,
then we say $\phi^{\mathrm{sg}}$ is a subgradient for $\phi$ at $v \in V$ if $\phi^{\mathrm{sg}}$ is a linear functional on $V$,
and we have

$$\phi(z) \geq \phi(v) + \phi^{\mathrm{sg}}(z - v) \quad \text{for all } z \in V. \tag{13.3}$$

The subdifferential $\partial \phi(v)$ consists of all subgradients of $\phi$ at $v$; note that it is a set
of linear functionals on $V$.

If $V = \mathbf{R}^n$, then every linear functional on $V$ has the form $g^T z$ for some vector
$g \in \mathbf{R}^n$, and our two definitions of subgradient are therefore the same, provided we
ignore the distinction between the vector $g \in \mathbf{R}^n$ and the linear functional on $\mathbf{R}^n$
given by the inner product with $g$.

13.1.2 Quasigradients 

For quasiconvex functions, there is a concept analogous to the subgradient. Suppose
$\phi : \mathbf{R}^n \to \mathbf{R}$ is quasiconvex, which we recall from section 6.2.2 means that

$$\phi(\lambda x + (1 - \lambda) \tilde{x}) \leq \max\{ \phi(x), \phi(\tilde{x}) \} \quad \text{for all } 0 \leq \lambda \leq 1, \ x, \tilde{x} \in \mathbf{R}^n.$$

We say that $g$ is a quasigradient for $\phi$ at $x$ if

$$\phi(z) \geq \phi(x) \quad \text{whenever } g^T (z - x) \geq 0. \tag{13.4}$$

This simply means that the hyperplane $g^T (z - x) = 0$ forms a simple cut for $\phi$,
exactly as in figure 13.2: if we are searching for a minimizer of $\phi$, we can rule out
the half-space $g^T (z - x) > 0$.

If $\phi$ is differentiable and $\nabla \phi(x) \neq 0$, then $\nabla \phi(x)$ is a quasigradient; if $\phi$ is convex,
then (13.2) shows that any subgradient is also a quasigradient. It can be shown
that every quasiconvex function has at least one quasigradient at every point. Note
that the length of a quasigradient is irrelevant (for our purposes): all that matters
is its direction, or equivalently, the cutting-plane for $\phi$ that it determines.

Any algorithm for convex optimization that uses only the cutting-planes that
are determined by subgradients will also work for quasiconvex functions, if we
substitute quasigradients for subgradients. It is not possible to form any deep-cut for
a quasiconvex function.
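The following sketch illustrates the difference between a quasigradient and a subgradient on a simple function of our own choosing, $\phi(z) = \log(1 + \|z\|^2)$, which is quasiconvex (an increasing function of $\|z\|$) but not convex. Its gradient defines a valid simple cut in the sense of (13.4), but the affine lower bound (13.2), on which a deep-cut would rely, fails far from $x$. The names `phi` and `quasigradient` and the test points are hypothetical.

```python
import numpy as np


def phi(z):
    """Quasiconvex but not convex: an increasing function of ||z||."""
    return float(np.log1p(z @ z))


def quasigradient(x):
    """Gradient of phi, which is a quasigradient wherever it is nonzero."""
    return 2.0 * x / (1.0 + x @ x)


x = np.array([1.0, 0.0])
g = quasigradient(x)

# Simple cut (13.4): phi(z) >= phi(x) on the half-space g^T (z - x) >= 0.
rng = np.random.default_rng(1)
Z = x + 4.0 * rng.standard_normal((2000, 2))
in_halfspace = (Z - x) @ g >= 0.0
cut_holds = all(phi(z) >= phi(x) for z in Z[in_halfspace])

# The subgradient inequality (13.2) fails for a point far along g.
z_far = x + 9.0 * g
affine_bound = phi(x) + g @ (z_far - x)

print(cut_holds)                  # True
print(phi(z_far), affine_bound)   # phi(z_far) is below the affine "bound"
```

Rescaling $g$ by any positive factor leaves the cut unchanged, which is why only the direction of a quasigradient matters.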

In the infinite-dimensional case, we will say that a linear functional $\phi^{\mathrm{qg}}$ on $V$ is
a quasigradient for the quasiconvex functional $\phi$ at $v \in V$ if

$$\phi(z) \geq \phi(v) \quad \text{whenever } \phi^{\mathrm{qg}}(z - v) \geq 0.$$

As discussed above, this agrees with our definition above for $V = \mathbf{R}^n$, provided we
do not distinguish between vectors and the associated inner product linear
functionals.

13.1.3 Subgradients and Directional Derivatives

In this section we briefly discuss the directional derivative, a concept of differential 
calculus that is more familiar than the subgradient. We will not use this concept in 
the optimization algorithms we present in the next chapter; we mention it because 
it is used in descent methods, the most common algorithms for optimization. 
We define the directional derivative of <j> at x in the direction 6x as 

,. .a .. <t>{x + h6x) - <f>(x) 

d> (x: ox) = hm 

h\o h 

(the notation h \ means that h converges to from above). It can be shown 
that for convex <j> this limit always exists. Of course, if <j> is differentiable at x, then 

<t>'{x; 8x) = V(f)(x) T 8x. 

We say that 6x is a descent direction for <j> at x if (f)'(x; 6x) < 0. 

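The directional derivative tells us how $\phi$ changes if $x$ is moved slightly in
the direction $\delta x$, since for small $h$,

$$\phi\left(x + h\frac{\delta x}{\|\delta x\|}\right) \approx \phi(x) + h\,\frac{\phi'(x; \delta x)}{\|\delta x\|}.$$

The steepest descent direction of $\phi$ at $x$ is defined as

$$\delta x_{\rm sd} = \mathop{\rm argmin}_{\|\delta x\| = 1} \phi'(x; \delta x).$$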

In general the directional derivatives, descent directions, and the steepest descent
direction of $\phi$ at $x$ can be described in terms of the subdifferential at $x$
(see the Notes and References at the end of the chapter). In many cases it is
considerably more difficult to find a descent direction or the steepest descent
direction of $\phi$ at $x$ than a single subgradient of $\phi$ at $x$.

If $\phi$ is differentiable at $x$, and $\nabla\phi(x) \neq 0$, then $-\nabla\phi(x)$
is a descent direction for $\phi$ at $x$. It is not true, however, that the negative
of any nonzero subgradient provides a descent direction: we can have
$g \in \partial\phi(x)$, $g \neq 0$, but $-g$ not a descent direction for $\phi$ at
$x$. As an example, the level curves of a convex function $\phi$ are shown in
figure 13.4(a), together with a point $x$ and a nonzero subgradient $g$. Note that
$\phi$ increases for any movement along the directions $\pm g$, so, in particular,
$-g$ is not a descent direction. Negatives of the subgradients at non-optimal points
are, however, descent directions for the distance to a (or any) minimizer, i.e., if
$\phi(x^*) = \phi^*$ and $\psi(z) = \|z - x^*\|$, then $\psi'(x; -g) < 0$ for any
$g \in \partial\phi(x)$. Thus, moving slightly in the direction $-g$ will decrease
the distance to (any) minimizer $x^*$, as shown in figure 13.4(b).

A consequence of these properties is that the optimization algorithms described 
in the next chapter do not necessarily generate sequences of decreasing functional 
values (as would a descent method). 
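
The behavior described above can be checked numerically on a small example. The
sketch below is only an illustration under our own assumptions: the function
$|x_1| + 2|x_2|$, the point, and the subgradient are chosen for convenience and are
not the function drawn in figure 13.4.

```python
import math

def phi(x):
    """Convex, nondifferentiable test function phi(x) = |x1| + 2|x2|."""
    return abs(x[0]) + 2.0 * abs(x[1])

def step(x, d, h):
    """Return the point x + h*d."""
    return (x[0] + h * d[0], x[1] + h * d[1])

def dist(x, y):
    """Euclidean distance between two points of R^2."""
    return math.hypot(x[0] - y[0], x[1] - y[1])

x = (1.0, 0.0)        # non-optimal point lying on the kink x2 = 0
g = (1.0, 2.0)        # one subgradient of phi at x (the set is {1} x [-2, 2])
neg_g = (-g[0], -g[1])
x_star = (0.0, 0.0)   # the unique minimizer of phi
h = 1e-3              # small step length

# phi increases for a small move along -g and along +g ...
change_minus = phi(step(x, neg_g, h)) - phi(x)
change_plus = phi(step(x, g, h)) - phi(x)

# ... yet the same move along -g brings us closer to the minimizer.
dist_before = dist(x, x_star)
dist_after = dist(step(x, neg_g, h), x_star)

print(change_minus > 0, change_plus > 0, dist_after < dist_before)
```

Both changes in $\phi$ are positive, so $-g$ is not a descent direction for $\phi$,
while the distance to the minimizer shrinks.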

Figure 13.4 A point $x$ and a subgradient $g$ of $\phi$ at $x$ are shown in (a),
together with three level curves of $\phi$. Note that $-g$ is not a descent direction
for $\phi$ at $x$: $\phi$ increases for any movement from $x$ in the directions
$\pm g$. However, $-g$ is a descent direction for the distance to any minimizer.
In (b) the level curves for the distance $\psi(z) = \|z - x^*\|$ are shown; $-g$
points into the circle through $x$.

13.2 Supporting Hyperplanes 

If $C$ is a convex subset of $\mathbf{R}^n$ and $x$ is a point on its boundary, then
we say that the hyperplane through $x$ with normal $g$, $\{z \mid g^T(z - x) = 0\}$,
is a supporting hyperplane to $C$ at $x$ if $C$ is contained in the half-space
$g^T(z - x) \leq 0$. Roughly speaking, if the set $C$ is "smooth" at $x$, then the
plane that is tangent to $C$ at $x$ is a supporting hyperplane, and $g$ is its
outward normal at $x$, as shown in figure 13.5. But the notion of supporting
hyperplane makes sense even when the set $C$ is not "smooth" at $x$. A basic result
of convex analysis is that there is at least one supporting hyperplane at every
boundary point of a convex set.

If $C$ has the form of a functional inequality,

$$C = \{z \mid \phi(z) \leq \alpha\},$$

where $\phi$ is convex (or quasiconvex), then a supporting hyperplane to $C$ at a
boundary point $x$ is simply $g^T(z - x) = 0$, where $g$ is any subgradient (or
quasigradient).

If $C$ is a convex subset of the infinite-dimensional space $V$ and $x$ is a point
of its boundary, then we say that the hyperplane

$$\{z \mid \phi^{\rm sh}(z - x) = 0\},$$

where $\phi^{\rm sh}$ is a nonzero linear functional on $V$, is a supporting
hyperplane for $C$ at the point $x$ if

$$\phi^{\rm sh}(z - x) \leq 0 \quad \mbox{for all } z \in C.$$

Figure 13.5 A point $x$ on the boundary of a convex set $C$. A supporting
hyperplane $g^T(z - x) = 0$ for $C$ at $x$ is shown: $C$ lies entirely in the
half-space $g^T(z - x) \leq 0$.

Again we note that this general definition agrees with the one above for
$V = \mathbf{R}^n$ if we do not distinguish between vectors in $\mathbf{R}^n$ and
their associated inner product functionals.



13.3 Tools for Computing Subgradients 

To use the algorithms that we will describe in the next two chapters we must be able 
to evaluate convex functionals and find at least one subgradient at any point. In 
this section, we list some useful tools for subgradient evaluation. Roughly speaking, 
if one can evaluate a convex functional at a point, then it is usually not much more 
trouble to determine a subgradient at that point. 

These tools come from more general results that describe all subgradients of, for 
example, the sum or maximum of convex functionals. These results can be found 
in any of the references mentioned in the Notes and References at the end of this 
chapter. The more general results, however, are much more than we need, since 
our purpose is to show how to calculate one subgradient of a convex functional 
at a point, not all subgradients at a point, a task which in many cases is very 
difficult, and in any case not necessary for the algorithms we describe in the next 
two chapters. The more general results have many more technical conditions. 

• Differentiable functional: If $\phi$ is convex and differentiable at $x$, then
its derivative at $x$ is an element of $\partial\phi(x)$. (In fact, it is the only
element of $\partial\phi(x)$.)

• Scaling: If $w \geq 0$ and $\phi$ is convex, then a subgradient of $w\phi$ at $x$
is given by $wg$, where $g$ is any subgradient of $\phi$ at $x$.

• Sum: If $\phi(x) = \phi_1(x) + \cdots + \phi_m(x)$, where $\phi_1, \ldots, \phi_m$
are convex, then any $g$ of the form $g = g_1 + \cdots + g_m$ is in
$\partial\phi(x)$, where $g_i \in \partial\phi_i(x)$.

• Maximum: Suppose that

$$\phi(x) = \sup \{\phi_\alpha(x) \mid \alpha \in \mathcal{A}\},$$

where each $\phi_\alpha$ is convex, and $\mathcal{A}$ is any index set. Suppose that
$\alpha_{\rm ach} \in \mathcal{A}$ is such that
$\phi_{\alpha_{\rm ach}}(x) = \phi(x)$ (so that $\phi_{\alpha_{\rm ach}}(x)$ achieves
the maximum). Then if $g \in \partial\phi_{\alpha_{\rm ach}}(x)$, we have
$g \in \partial\phi(x)$. Of course there may be several different indices that
achieve the maximum; we need only pick one.

A special case is when $\phi$ is the maximum of the functionals
$\phi_1, \ldots, \phi_n$, so that $\mathcal{A} = \{1, \ldots, n\}$. If
$\phi(x) = \phi_i(x)$, then any subgradient $g$ of $\phi_i$ at $x$ is also a
subgradient of $\phi$ at $x$.

From these tools we can derive additional tools for determining a subgradient of
a weighted sum or weighted maximum of convex functionals. Their use will become
clear in the next section.
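
The convex tools above compose mechanically. As a minimal sketch, assuming each
function on $\mathbf{R}^n$ is represented by an "oracle" that returns a pair (value,
one subgradient), the scaling, sum, and maximum tools might be coded as follows.
The names `affine`, `maximum`, and `weighted_sum` are our own illustrative choices.

```python
def affine(a, b):
    """Oracle for the affine function a^T x + b: returns (value, subgradient)."""
    def oracle(x):
        return sum(ai * xi for ai, xi in zip(a, x)) + b, list(a)
    return oracle

def maximum(oracles):
    """Maximum tool: reuse a subgradient of any function achieving the max."""
    def oracle(x):
        return max((f(x) for f in oracles), key=lambda pair: pair[0])
    return oracle

def weighted_sum(weights, oracles):
    """Scaling and sum tools: sum_i w_i g_i is a subgradient when all w_i >= 0."""
    def oracle(x):
        pairs = [f(x) for f in oracles]
        value = sum(w * v for w, (v, _) in zip(weights, pairs))
        grad = [sum(w * g[j] for w, (_, g) in zip(weights, pairs))
                for j in range(len(x))]
        return value, grad
    return oracle

# Build phi(x) = |x1| + 2|x2| from affine pieces and evaluate it at (1, 0).
abs1 = maximum([affine([1.0, 0.0], 0.0), affine([-1.0, 0.0], 0.0)])
abs2 = maximum([affine([0.0, 1.0], 0.0), affine([0.0, -1.0], 0.0)])
phi = weighted_sum([1.0, 2.0], [abs1, abs2])
value, g = phi([1.0, 0.0])

def inequality_holds(oracle, x, points, tol=1e-12):
    """Check phi(z) >= phi(x) + g^T (z - x) at the given test points."""
    fx, gx = oracle(x)
    return all(
        oracle(z)[0] >= fx + sum(gi * (zi - xi) for gi, zi, xi in zip(gx, z, x)) - tol
        for z in points
    )
```

Only one subgradient is returned at each point, which is all that the algorithms of
the next chapters require.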

For quasiconvex functionals, we have the analogous tools: 

• Differentiable functional: If $\phi$ is quasiconvex and differentiable at $x$,
with nonzero derivative, then its derivative at $x$ is a quasigradient of $\phi$ at
$x$.

• Scaling: If $w > 0$ and $\phi$ is quasiconvex, then any quasigradient of $\phi$ at
$x$ is also a quasigradient of $w\phi$ at $x$.

• Maximum: Suppose that

$$\phi(x) = \sup \{\phi_\alpha(x) \mid \alpha \in \mathcal{A}\},$$

where each $\phi_\alpha$ is quasiconvex, and $\mathcal{A}$ is any index set. Suppose
that $\alpha_{\rm ach} \in \mathcal{A}$ is such that
$\phi_{\alpha_{\rm ach}}(x) = \phi(x)$. Then if $g$ is a quasigradient of
$\phi_{\alpha_{\rm ach}}$ at $x$, then $g$ is a quasigradient of $\phi$ at $x$.

• Nested family: Suppose that $\phi$ is defined in terms of a nested family of convex
sets, i.e., $\phi(x) = \inf \{\alpha \mid x \in C_\alpha\}$, where
$C_\alpha \subseteq C_\beta$ whenever $\alpha \leq \beta$ (see section 6.2.2). If
$g^T(z - x) = 0$ defines a supporting hyperplane to $C_{\phi(x)}$ at $x$, then $g$
is a quasigradient of $\phi$ at $x$.

(The sum tool is not applicable because the sum of quasiconvex functionals need
not be quasiconvex.)

13.4 Computing Subgradients

In this section we show how to compute subgradients of several of the convex
functionals we have encountered in chapters 8-10. Since these are convex functionals
on $\mathcal{H}$, an infinite-dimensional space, the subgradients we derive will be
linear functionals on $\mathcal{H}$. In the next section we show how these can be
used to calculate subgradients in $\mathbf{R}^n$ when a finite-dimensional
approximation of the kind used in chapter 15 is made; the algorithms of the next
chapter can then be used.

In general, the convex functionals we consider will be functionals of some par- 
ticular entry (or block of entries) of the closed-loop transfer matrix H. To simplify 
notation, we will assume in each subsection that H consists of only the relevant 
entry or entries. 

13.4.1 An RMS Response 

We consider the weighted $H_2$ norm,

$$\phi(H) = \left(\frac{1}{2\pi}\int_{-\infty}^{\infty} S_w(\omega)\,|H(j\omega)|^2\,d\omega\right)^{1/2},$$

with SISO $H$ for simplicity (and of course, $S_w(\omega) \geq 0$). We will determine
a subgradient of $\phi$ at the transfer function $H_0$. If $\phi(H_0) = 0$, then the
zero functional is a subgradient, so we now assume that $\phi(H_0) \neq 0$. In this
case $\phi$ is differentiable at $H_0$, so our first rule above tells us that our
only choice for a subgradient is the derivative of $\phi$ at $H_0$, which is the
linear functional $\phi^{\rm sg}$ given by

$$\phi^{\rm sg}(H) = \frac{1}{2\pi\,\phi(H_0)}\int_{-\infty}^{\infty} S_w(\omega)\,\Re\left(\overline{H_0(j\omega)}\,H(j\omega)\right)d\omega.$$

(The reader can verify that for small $H$,
$\phi(H_0 + H) \approx \phi(H_0) + \phi^{\rm sg}(H)$; the Cauchy-Schwarz inequality
can be used to directly verify that the subgradient inequality (13.3) holds.)

Using the subgradient for $\phi$, we can find a supporting hyperplane to the maximum
RMS response specification $\phi(H) \leq \alpha$.

There is an analogous expression for the case when $H$ is a transfer matrix. For
$\phi(H) = \|H\|_2$ and $H_0 \neq 0$, a subgradient of $\phi$ at $H_0$ is given by

$$\phi^{\rm sg}(H) = \frac{1}{2\pi\,\phi(H_0)}\int_{-\infty}^{\infty} \Re\,{\rm Tr}\left(H_0(j\omega)^* H(j\omega)\right)d\omega.$$



13.4.2 Step Response Overshoot 

We consider the overshoot functional,

$$\phi(H) = \sup_{t \geq 0} s(t) - 1,$$

where $s$ is the unit step response of the transfer function $H$. We will determine
a subgradient at $H_0$. The unit step response of $H_0$ will be denoted $s_0$.

We will use the rule that involves a maximum of a family of convex functionals.
For each $t \geq 0$, we define a functional $\phi^{{\rm step},t}$ as follows:
$\phi^{{\rm step},t}(H) = s(t)$. The functional $\phi^{{\rm step},t}$ evaluates the
step response of its argument at the time $t$; it is a linear functional, since we
can express it as

$$\phi^{{\rm step},t}(H) = \frac{1}{2\pi}\int_{-\infty}^{\infty} \frac{e^{j\omega t}}{j\omega}\,H(j\omega)\,d\omega.$$

Note that we can express the overshoot functional $\phi$ as the maximum of the
affine functionals $\phi^{{\rm step},t} - 1$:

$$\phi(H) = \sup_{t \geq 0} \phi^{{\rm step},t}(H) - 1.$$

Now we apply our last rule. Let $t_0 \geq 0$ denote any time such that the overshoot
is achieved, that is, $\phi(H_0) = s_0(t_0) - 1$. There may be several instants at
which the overshoot is achieved; $t_0$ can be any of them. (We ignore the
pathological case where the overshoot is not achieved, but only approached as a
limit, although it is possible to determine a subgradient in this case as well.)

Using our last rule, we find that any subgradient of the functional
$\phi^{{\rm step},t_0} - 1$ is a subgradient of $\phi$ at $H_0$. But the functional
$\phi^{{\rm step},t_0} - 1$ is affine; its derivative is just
$\phi^{{\rm step},t_0}$. Hence we have determined that the linear functional
$\phi^{{\rm step},t_0}$ is a subgradient of $\phi$ at $H_0$.

Let us verify the basic subgradient inequality (13.3). It is

$$\phi(H) \geq \phi(H_0) + \phi^{{\rm step},t_0}(H - H_0).$$

Using linearity of $\phi^{{\rm step},t_0}$ and the fact that
$\phi(H_0) = s_0(t_0) - 1 = \phi^{{\rm step},t_0}(H_0) - 1$, the subgradient
inequality is

$$\phi(H) \geq s(t_0) - 1.$$

Of course, this is obvious: it states that for any transfer function, the overshoot
is at least as large as the value of the unit step response at the particular time
$t_0$, minus 1.

A subgradient of other functionals involving the maximum of a time domain
quantity, e.g., maximum envelope violation (see section 8.1.1), can be computed in
a similar way.
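
As a rough illustration of this construction, suppose a step response is available
only as a finite list of samples. This sampled setting, the data, and the function
names below are assumptions made for the sketch. The evaluation functional then
simply reads off one sample.

```python
def overshoot(s):
    """Overshoot of a sampled unit step response: max_k s[k] - 1."""
    return max(s) - 1.0

def overshoot_subgradient(s0):
    """Return (k0, functional) for the sampled overshoot at s0.

    k0 is a sample index at which the peak of s0 is achieved, and
    functional(s) = s[k0] is the corresponding evaluation functional.
    """
    k0 = max(range(len(s0)), key=lambda k: s0[k])
    return k0, (lambda s: s[k0])

# Hypothetical sampled step responses (illustrative data only).
s0 = [0.0, 0.6, 1.1, 1.25, 1.1, 0.98, 1.0]
s = [0.0, 0.9, 1.3, 1.05, 0.95, 1.0, 1.0]

k0, sg = overshoot_subgradient(s0)
difference = [a - b for a, b in zip(s, s0)]

# Subgradient inequality: phi(s) >= phi(s0) + sg(s - s0), i.e. phi(s) >= s[k0] - 1.
lhs = overshoot(s)
rhs = overshoot(s0) + sg(difference)
```

Here the inequality `lhs >= rhs` is just the statement that the peak of `s` is at
least its value at the sample `k0`.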

13.4.3 Quasigradient for Settling Time 

Suppose that $\phi$ is the settling-time functional, defined in section 8.1.1:

$$\phi(H) = \inf \{T \mid 0.95 \leq s(t) \leq 1.05 \mbox{ for } t \geq T\}.$$

We now determine a quasigradient for $\phi$ at the transfer function $H_0$. Let
$T_0 = \phi(H_0)$, the settling time of $H_0$. $s_0(T_0)$ is either 0.95 or 1.05.
Suppose first that $s_0(T_0) = 1.05$. We now observe that any transfer function with
unit step response at time $T_0$ greater than or equal to 1.05 must have a settling
time greater than or equal to $T_0$, in other words,

$$\phi(H) \geq T_0 \quad \mbox{whenever} \quad s(T_0) \geq 1.05.$$

Using the step response evaluation functionals introduced above, we can express
this observation as

$$\phi(H) \geq \phi(H_0) \quad \mbox{whenever} \quad \phi^{{\rm step},T_0}(H - H_0) \geq 0.$$

But this shows that the nonzero linear functional $\phi^{{\rm step},T_0}$ is a
quasigradient for $\phi$ at $H_0$.

In general we have the quasigradient $\phi^{\rm qg}$ for $\phi$ at $H_0$, where

$$\phi^{\rm qg} = \left\{ \begin{array}{ll}
\phi^{{\rm step},T_0} & \mbox{if } s_0(T_0) = 1.05, \\
-\phi^{{\rm step},T_0} & \mbox{if } s_0(T_0) = 0.95,
\end{array} \right.$$

and $T_0 = \phi(H_0)$.
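
A sampled analogue of this quasigradient can be sketched as follows. It is an
assumption of the sketch, not part of the development above, that the step response
is known only at sample instants: the role of $T_0$ is then played by the last
sample that lies outside the band, rather than by a time at which the response
touches 0.95 or 1.05 exactly.

```python
def settling_index(s, lo=0.95, hi=1.05):
    """Smallest K with lo <= s[k] <= hi for all k >= K (len(s) if the last sample violates)."""
    K = len(s)
    while K > 0 and lo <= s[K - 1] <= hi:
        K -= 1
    return K

def settling_quasigradient(s0, lo=0.95, hi=1.05):
    """Return (v, sign) so that s -> sign * s[v] is a quasigradient at s0.

    v is the last sample of s0 outside the band. Returns None when s0 never
    leaves the band; the settling index is then 0, its smallest possible value.
    """
    K = settling_index(s0, lo, hi)
    if K == 0:
        return None
    v = K - 1
    sign = 1.0 if s0[v] > hi else -1.0
    return v, sign

# Hypothetical sampled step responses (illustrative data only).
s0 = [0.0, 0.5, 1.2, 1.08, 0.97, 1.01, 1.0]
v, sign = settling_quasigradient(s0)

candidates = [
    [0.0, 0.9, 1.0, 1.10, 1.0, 1.0, 1.0],
    [0.0, 0.9, 1.0, 1.20, 1.0, 0.9, 1.0],
]
# Whenever sign * (s[v] - s0[v]) >= 0, the settling index cannot decrease.
checks = [
    settling_index(s) >= settling_index(s0)
    for s in candidates
    if sign * (s[v] - s0[v]) >= 0
]
```

The check mirrors the argument in the text: a response that is at least as far
outside the band at sample `v` cannot have settled before the following sample.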

13.4.4 Maximum Magnitude of a Transfer Function

We first consider the case of SISO $H$. Suppose that

$$\phi(H) = \|H\|_\infty = \sup_{\omega \in \mathbf{R}} |H(j\omega)|,$$

provided $H$ is stable (see section 5.2.6). (We leave to the reader the modification
necessary if $\phi$ is a weighted $H_\infty$ norm.) We will determine a subgradient
of $\phi$ at the stable transfer function $H_0 \neq 0$.

For each $\omega \in \mathbf{R}$, consider the functional that evaluates the
magnitude of its argument (a transfer function) at the frequency $j\omega$:

$$\phi^{{\rm mag},\omega}(H) = |H(j\omega)|.$$

These functionals are convex, and we can express the maximum magnitude norm as

$$\phi(H) = \sup_{\omega \in \mathbf{R}} \phi^{{\rm mag},\omega}(H).$$

Thus we can use our maximum tool to find a subgradient.

Suppose that $\omega_0 \in \mathbf{R}$ is any frequency such that
$|H_0(j\omega_0)| = \phi(H_0)$. (We ignore the pathological case where the supremum
is only approached as a limit. In this case it is still possible to determine a
subgradient.) Then any subgradient of $\phi^{{\rm mag},\omega_0}$ at $H_0$ is a
subgradient of $\phi$ at $H_0$. But since $H_0(j\omega_0) \neq 0$, this functional
is differentiable at $H_0$, with derivative

$$\phi^{\rm sg}(H) = \frac{1}{\phi(H_0)}\,\Re\left(\overline{H_0(j\omega_0)}\,H(j\omega_0)\right).$$

This linear functional is a subgradient of $\phi$ at $H_0$. The reader can directly
verify that the subgradient inequality (13.3) holds.
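
The verification can also be carried out numerically on a frequency grid. The
sketch below assumes the transfer functions are represented by complex samples of
their frequency responses; the two example responses and the function names are
hypothetical choices made only for illustration.

```python
def hinf_sampled(H):
    """Largest magnitude over a sampled frequency response (list of complex numbers)."""
    return max(abs(Hk) for Hk in H)

def hinf_subgradient(H0):
    """Return the linear functional H -> Re(conj(H0[k0]) * H[k0]) / |H0[k0]|.

    k0 is a sample index at which the maximum magnitude of H0 is achieved;
    H0 is assumed to be nonzero there.
    """
    k0 = max(range(len(H0)), key=lambda k: abs(H0[k]))
    peak = abs(H0[k0])
    if peak == 0.0:
        raise ValueError("H0 must be nonzero")
    return lambda H: (H0[k0].conjugate() * H[k0]).real / peak

# Hypothetical frequency responses sampled on 0 <= w <= 10.
ws = [0.05 * k for k in range(201)]
H0 = [1.0 / complex(1.0 - w * w, 0.4 * w) for w in ws]   # lightly damped resonance
H1 = [2.0 / complex(1.0, w) for w in ws]                  # first-order lag

sg = hinf_subgradient(H0)
difference = [a - b for a, b in zip(H1, H0)]

# Subgradient inequality: phi(H1) >= phi(H0) + sg(H1 - H0).
lhs = hinf_sampled(H1)
rhs = hinf_sampled(H0) + sg(difference)
```

Since `sg(H0)` equals the peak magnitude of `H0`, the right-hand side reduces to
`sg(H1)`, which cannot exceed the magnitude of `H1` at the chosen frequency.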



13.4.5 $H_\infty$ Norm of a Transfer Matrix

Now suppose that $H$ is an $m \times p$ transfer matrix, and $\phi$ is the
$H_\infty$ norm:

$$\phi(H) = \|H\|_\infty.$$

We will express $\phi$ directly as the maximum of a set of linear functionals, as
follows. For each $\omega \in \mathbf{R}$, $u \in \mathbf{C}^m$, and
$v \in \mathbf{C}^p$, we define the linear functional

$$\phi^{u,v,\omega}(H) = \Re\left(u^* H(j\omega) v\right).$$

Then we have

$$\phi(H) = \sup \{\phi^{u,v,\omega}(H) \mid \omega \in \mathbf{R},\ \|u\| = \|v\| = 1\},$$

using the fact that for any matrix $A \in \mathbf{C}^{m \times p}$,

$$\sigma_{\max}(A) = \sup \{\Re(u^* A v) \mid \|u\| = \|v\| = 1\}.$$

Now we can determine a subgradient of $\phi$ at the transfer matrix $H_0$. We pick
any frequency $\omega_0 \in \mathbf{R}$ at which the $H_\infty$ norm of $H_0$ is
achieved, i.e.,

$$\sigma_{\max}(H_0(j\omega_0)) = \|H_0\|_\infty.$$

(Again, we ignore the case where there is no such $\omega_0$, commenting that for
rational $H_0$, there always is such a frequency, if we allow $\omega_0 = \infty$.)
We now compute a singular value decomposition of $H_0(j\omega_0)$:

$$H_0(j\omega_0) = U \Sigma V^*.$$

Let $u_0$ be the first column of $U$, and let $v_0$ be the first column of $V$. A
subgradient of $\phi$ at $H_0$ is given by the linear functional

$$\phi^{\rm sg}(H) = \phi^{u_0,v_0,\omega_0}(H) = \Re\left(u_0^* H(j\omega_0) v_0\right).$$

13.4.6 Peak Gain

We consider the peak gain functional

$$\phi(H) = \|H\|_{\rm pk\_gn} = \int_0^\infty |h(t)|\,dt.$$

In this case our functional is an integral of a family of convex functionals. We will
guess a subgradient of $\phi$ at the transfer function $H_0$, reasoning by analogy
with the sum rule above, and then verify that our guess is indeed a subgradient. The
technique of the next section shows an alternate method by which we could derive
a subgradient of the peak gain functional.

Let $h_0$ denote the impulse response of $H_0$. For each $t \geq 0$ we define the
functional that gives the absolute value of the impulse response of the argument at
time $t$:

$$\phi^{{\rm abs\_h},t}(H) = |h(t)|.$$

These functionals are convex, and we can express $\phi$ as

$$\phi(H) = \int_0^\infty \phi^{{\rm abs\_h},t}(H)\,dt.$$

If we think of this integral as a generalized sum, then from our sum rule we might
suspect that the linear functional

$$\phi^{\rm sg}(H) = \int_0^\infty \phi^{{\rm sg},t}(H)\,dt$$

is a subgradient for $\phi$, where for each $t$, $\phi^{{\rm sg},t}$ is a subgradient
of $\phi^{{\rm abs\_h},t}$ at $H_0$. Now, these functionals are differentiable for
those $t$ such that $h_0(t) \neq 0$, and 0 is a subgradient of
$\phi^{{\rm abs\_h},t}$ at $H_0$ for those $t$ such that $h_0(t) = 0$. Hence a
specific choice for our guess is

$$\phi^{\rm sg}(H) = \int_0^\infty {\rm sgn}(h_0(t))\,h(t)\,dt.$$

We will verify that this is a subgradient of $\phi$ at $H_0$.

For each $t$ and any $h$ we have $|h(t)| \geq {\rm sgn}(h_0(t))\,h(t)$; hence

$$\phi(H) = \int_0^\infty |h(t)|\,dt \geq \int_0^\infty {\rm sgn}(h_0(t))\,h(t)\,dt.$$

This can be rewritten as

$$\phi(H) \geq \int_0^\infty \left(|h_0(t)| + {\rm sgn}(h_0(t))\,(h(t) - h_0(t))\right)dt = \phi(H_0) + \phi^{\rm sg}(H - H_0).$$

This verifies that $\phi^{\rm sg}$ is a subgradient of $\phi$ at $H_0$.
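
The same verification can be repeated on sampled data. In the sketch below the
integral is replaced by a Riemann sum over samples of the impulse response; the two
impulse responses, the sample spacing, and the function names are assumptions made
for illustration.

```python
import math

def sgn(a):
    """Sign function with sgn(0) = 0."""
    return (a > 0) - (a < 0)

def peak_gain(h, dt):
    """Riemann-sum approximation of the peak gain: sum_k |h[k]| * dt."""
    return sum(abs(hk) for hk in h) * dt

def peak_gain_subgradient(h0, dt):
    """Return the linear functional h -> sum_k sgn(h0[k]) * h[k] * dt."""
    signs = [sgn(hk) for hk in h0]
    return lambda h: sum(sk * hk for sk, hk in zip(signs, h)) * dt

dt = 0.01
ts = [k * dt for k in range(1000)]
h0 = [math.exp(-t) * math.cos(3.0 * t) for t in ts]   # hypothetical impulse response
h = [2.0 * math.exp(-2.0 * t) - math.exp(-t) for t in ts]

sg = peak_gain_subgradient(h0, dt)
difference = [a - b for a, b in zip(h, h0)]

# Subgradient inequality: phi(h) >= phi(h0) + sg(h - h0).
lhs = peak_gain(h, dt)
rhs = peak_gain(h0, dt) + sg(difference)
```

Note that `sg(h0)` reproduces `peak_gain(h0, dt)`, so the right-hand side is simply
`sg(h)`, in agreement with the pointwise inequality used in the text.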

13.4.7 A Worst Case Norm

We consider the particular worst case norm described in section 5.1.3:

$$\phi(H) = \|H\|_{\rm wc} = \sup \{\|Hu\|_\infty \mid \|u\|_\infty \leq M_{\rm ampl},\ \|\dot{u}\|_\infty \leq M_{\rm slew}\}.$$

We first rewrite $\phi$ as

$$\phi(H) = \sup \left\{ \int_0^\infty v(t)h(t)\,dt \;\middle|\; \|v\|_\infty \leq M_{\rm ampl},\ \|\dot{v}\|_\infty \leq M_{\rm slew} \right\}. \tag{13.5}$$

Now for each signal $v$ we define the linear functional

$$\phi^{v}(H) = \int_0^\infty v(t)h(t)\,dt.$$

We can express the worst case norm as a maximum of a set of these linear
functionals:

$$\phi(H) = \sup \{\phi^{v}(H) \mid \|v\|_\infty \leq M_{\rm ampl},\ \|\dot{v}\|_\infty \leq M_{\rm slew}\}.$$

We proceed as follows to find a subgradient of $\phi$ at the transfer matrix $H_0$.
Find a signal $v_0$ such that

$$\|v_0\|_\infty \leq M_{\rm ampl}, \quad \|\dot{v}_0\|_\infty \leq M_{\rm slew}, \quad \int_0^\infty v_0(t)h_0(t)\,dt = \phi(H_0).$$

(It can be shown that in this case there always is such a $v_0$; some methods for
finding $v_0$ are described in the Notes and References for chapter 5.) Then a
subgradient of $\phi$ at $H_0$ is given by

$$\phi^{\rm sg}(H) = \phi^{v_0}(H).$$

The same procedure works for any worst case norm: first, find a worst case
input signal $u_0$ such that $\|H_0\|_{\rm wc} = \|H_0 u_0\|_{\rm output}$. This task
must usually be done to evaluate $\|H_0\|_{\rm wc}$ anyway. Now find any subgradient
of the convex functional $\phi^{u_0}(H) = \|H u_0\|_{\rm output}$; it will be a
subgradient of $\|\cdot\|_{\rm wc}$ at $H_0$.

13.4.8 Subgradient for the Negative Dual Function 

In this section we show how to find a subgradient for $-\psi$ at $\lambda$, where
$\psi$ is the dual function introduced in section 3.6.2 and discussed in section 6.6.
Recall from (6.8) of section 6.6 that $-\psi$ can be expressed as the maximum of a
family of linear functions of $\lambda$; we can therefore use the maximum tool to
find a subgradient.

We start by finding any $H_{\rm ach}$ such that

$$\psi(\lambda) = \lambda_1\phi_1(H_{\rm ach}) + \cdots + \lambda_L\phi_L(H_{\rm ach}).$$

Then a subgradient of $-\psi$ at $\lambda$ is given by

$$g = -\begin{bmatrix} \phi_1(H_{\rm ach}) \\ \vdots \\ \phi_L(H_{\rm ach}) \end{bmatrix}.$$

13.5 Subgradients on a Finite-Dimensional Subspace

In the previous section we determined subgradients for many of the convex
functionals we encountered in chapters 8-10. These subgradients are linear
functionals on the infinite-dimensional space of transfer matrices; most numerical
computation will be done on finite-dimensional subspaces of transfer matrices (as we
will see in chapter 15). In this section we show how the subgradients computed above
can be used to calculate subgradients on finite-dimensional subspaces of transfer
matrices.

Suppose that we have fixed transfer matrices $H_0, H_1, \ldots, H_N$, and $\phi$ is
some convex functional on transfer matrices. We consider the convex function
$\varphi : \mathbf{R}^N \to \mathbf{R}$ given by

$$\varphi(x) = \phi(H_0 + x_1 H_1 + \cdots + x_N H_N).$$

To determine some $g \in \partial\varphi(x)$, we find a subgradient of $\phi$ at the
transfer matrix $H_0 + x_1 H_1 + \cdots + x_N H_N$, say, $\phi^{\rm sg}$. Then

$$g = \begin{bmatrix} \phi^{\rm sg}(H_1) \\ \vdots \\ \phi^{\rm sg}(H_N) \end{bmatrix} \in \partial\varphi(x).$$

Let us give a specific example using our standard plant of section 2.4. Consider
the weighted peak tracking error functional of section 11.1.2,

$$\varphi_{\rm pk\_trk}(\alpha, \beta) = \left\| W\left(\alpha H_{13}^{(a)} + \beta H_{13}^{(b)} + (1 - \alpha - \beta) H_{13}^{(c)} - 1\right) \right\|_{\rm pk\_gn},$$

where

$$W = \frac{0.5}{s + 0.5},$$

$$H_{13}^{(a)} = \frac{-44.1s^3 + 334s^2 + 1034s + 390}{s^6 + 20s^5 + 155s^4 + 586s^3 + 1115s^2 + 1034s + 390},$$

$$H_{13}^{(b)} = \frac{-220s^3 + 222s^2 + 19015s + 7245}{s^6 + 29.1s^5 + 297s^4 + 1805s^3 + 9882s^2 + 19015s + 7245},$$

$$H_{13}^{(c)} = \frac{-95.1s^3 - 24.5s^2 + 9505s + 2449}{s^6 + 33.9s^5 + 425s^4 + 2588s^3 + 8224s^2 + 9505s + 2449}.$$

$\varphi_{\rm pk\_trk}$ has the form

$$\varphi_{\rm pk\_trk}(\alpha, \beta) = \|H_0 + \alpha H_1 + \beta H_2\|_{\rm pk\_gn},$$

where

$$H_0 = W\left(H_{13}^{(c)} - 1\right), \quad
H_1 = W\left(H_{13}^{(a)} - H_{13}^{(c)}\right), \quad
H_2 = W\left(H_{13}^{(b)} - H_{13}^{(c)}\right).$$
The level curves of $\varphi_{\rm pk\_trk}$ are shown in figure 11.4. A subgradient
$g \in \partial\varphi_{\rm pk\_trk}(\alpha, \beta)$ is given by

$$g = \begin{bmatrix} \phi^{\rm sg}(H_1) \\ \phi^{\rm sg}(H_2) \end{bmatrix}
= \begin{bmatrix} \displaystyle\int_0^\infty {\rm sgn}(h(t))\,h_1(t)\,dt \\[2ex] \displaystyle\int_0^\infty {\rm sgn}(h(t))\,h_2(t)\,dt \end{bmatrix},$$

where $h$ is the impulse response of $H_0 + \alpha H_1 + \beta H_2$, $h_1$ is the
impulse response of $H_1$, and $h_2$ is the impulse response of $H_2$ (see
section 13.4.6).
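
The formula above translates directly into a computation on sampled impulse
responses. The sketch below is not a reproduction of the standard-plant example:
the three impulse responses are hypothetical stand-ins for those of $H_0$, $H_1$,
and $H_2$, the integrals are replaced by Riemann sums, and the names `varphi` and
`varphi_subgradient` are our own.

```python
import math

dt = 0.01
ts = [k * dt for k in range(2000)]

# Hypothetical impulse responses standing in for those of H0, H1, and H2.
h0 = [math.exp(-t) * math.cos(3.0 * t) for t in ts]
h1 = [math.exp(-2.0 * t) for t in ts]
h2 = [-math.exp(-t) * math.sin(2.0 * t) for t in ts]

def varphi(alpha, beta):
    """Sampled peak gain of H0 + alpha*H1 + beta*H2."""
    return sum(abs(a + alpha * b + beta * c) for a, b, c in zip(h0, h1, h2)) * dt

def varphi_subgradient(alpha, beta):
    """g = [sum sgn(h) h1 dt, sum sgn(h) h2 dt], with h the combined response."""
    g1 = g2 = 0.0
    for a, b, c in zip(h0, h1, h2):
        h = a + alpha * b + beta * c
        s = (h > 0) - (h < 0)
        g1 += s * b * dt
        g2 += s * c * dt
    return (g1, g2)

base = varphi(1.0, 0.0)
g = varphi_subgradient(1.0, 0.0)

# Check the subgradient inequality on a small grid around the point (1, 0).
grid = [(1.0 + 0.5 * i, 0.5 * j) for i in range(-2, 3) for j in range(-2, 3)]
holds = all(
    varphi(a, b) >= base + g[0] * (a - 1.0) + g[1] * b - 1e-9
    for a, b in grid
)
```

Each component of `g` is the subgradient functional of section 13.4.6 applied to
one of the fixed directions, exactly as in the general construction of this section.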

Consider the point $\alpha = 1$, $\beta = 0$, where
$\varphi_{\rm pk\_trk}(1, 0) = 0.837$. A subgradient at this point is

$$g = \begin{bmatrix} 0.168 \\ -0.309 \end{bmatrix}. \tag{13.6}$$

In figure 13.6 the level curve $\varphi_{\rm pk\_trk}(\alpha, \beta) = 0.837$ is
shown, together with the subgradient (13.6) at the point $[1\ 0]^T$. As expected,
the subgradient determines a half-space that contains the convex set

$$\left\{ [\alpha\ \beta]^T \mid \varphi_{\rm pk\_trk}(\alpha, \beta) \leq \varphi_{\rm pk\_trk}(1, 0) = 0.837 \right\}.$$

Figure 13.6 The level curve $\varphi_{\rm pk\_trk}(\alpha, \beta) = 0.837$ is shown,
together with the subgradient (13.6) at the point $[1\ 0]^T$.

Notes and References

Convex Analysis 

Rockafellar's book [Roc70] covers convex analysis in detail. Other texts covering this 
material are Stoer and Witzgall [SW70], Barbu and Precupanu [BP78], Aubin and Vin- 
ter [AV82], and Demyanov and Vasilev [DV85]. The last three consider the infinite- 
dimensional (Banach space) case, carefully distinguishing between the many important dif- 
ferent types of continuity, compactness, and so on, which we have not considered. Daniel's 
book [Dan71] also gives the precise infinite-dimensional formulation. 

General nonsmooth analysis (which includes convex analysis) is covered in Clarke [Cla83], 
which gives the precise formulation of the concepts of this chapter in infinite-dimensional 
(Banach) spaces. Chapter 2 of Clarke's book contains a complete calculus for subgradients 
and quasigradients, including the subgradient computation tools of section 13.3 (and a lot 
more). 

Subgradients of Closed-Loop Convex Functionals 

Subgradients of an $H_\infty$ norm are given in Polak and Wardi [PW82]; several of
the other subgradients are derived in Polak and Salcudean [PS89] and Salcudean's
Ph.D. thesis [Sal86].

Chapter 14

Special Algorithms for Convex 
Optimization 



We describe several simple but powerful algorithms that are specifically designed 
for convex optimization. These algorithms require only the ability to compute 
the function value and any subgradient for each relevant function. A key feature 
of these algorithms is that they maintain converging upper and lower bounds on 
the quantity being computed, and thus can compute the quantity to a guaranteed 
accuracy. We demonstrate each on the two-parameter example of chapter 11; in
chapter 15 they are applied to more substantial problems.

In this chapter we concentrate on the finite-dimensional case; these methods are
extended to the infinite-dimensional case in the next chapter.

14.1 Notation and Problem Definitions 

We will consider several specific forms of optimization problems. 

The unconstrained problem is to compute the minimum value of $\phi$,

$$\phi^* = \min_z \phi(z), \tag{14.1}$$

and in addition to compute a minimizer $x^*$, which satisfies
$\phi(x^*) = \phi^*$. We will use the notation

$$x^* = \mathop{\rm argmin} \phi(z)$$

to mean that $x^*$ is some minimizer of $\phi$.

The constrained optimization problem is to compute the minimum value of $\phi$,
subject to the constraints $\psi_1(z) \leq 0, \ldots, \psi_m(z) \leq 0$, i.e.,

$$\phi^* = \min_{\psi_1(z) \leq 0, \ldots, \psi_m(z) \leq 0} \phi(z),$$

and in addition to compute a minimizer $x^*$ that satisfies $\phi(x^*) = \phi^*$,
$\psi_1(x^*) \leq 0, \ldots, \psi_m(x^*) \leq 0$. To simplify notation we define the
constraint function

$$\psi(z) = \max_{1 \leq i \leq m} \psi_i(z),$$

so we can express the constraints as $\psi(z) \leq 0$:

$$\phi^* = \min_{\psi(z) \leq 0} \phi(z). \tag{14.2}$$

We will say that $z$ is feasible if $\psi(z) \leq 0$.

The feasibility problem is:

    find $x$ such that $\psi(x) \leq 0$, or determine that there is no such $x$.    (14.3)

Algorithms that are designed for one form of the convex optimization problem
can often be modified or adapted for others; we will see several examples of this.
It is also possible to modify the algorithms we present to directly solve other
problems that we do not consider, e.g., to find Pareto optimal points or to do goal
programming.

Throughout this chapter, unless otherwise stated, $\phi$ and
$\psi_1, \ldots, \psi_m$ are convex functions from $\mathbf{R}^n$ into $\mathbf{R}$.
The constraint function $\psi$ is then also convex.

14.2 On Algorithms for Convex Optimization 

Many algorithms for convex optimization have been devised, and we could not hope 
to survey them here. Instead, we give a more detailed description of two types of 
algorithms that are specifically designed for convex problems: cutting-plane and 
ellipsoid algorithms. 

Let us briefly mention another large family of algorithms, the descent meth- 
ods, contrasting them with the algorithms that we will describe. In these methods, 
successive iterations produce points that have decreasing objective values. General- 
purpose descent methods have been successfully applied to, and adapted for, non- 
differentiable convex optimization; see the Notes and References at the end of this 
chapter. The cutting-plane and ellipsoid algorithms are not descent methods; ob- 
jective values often increase after an iteration. 

Possible advantages and disadvantages of some of the descent methods over 
cutting-plane and ellipsoid methods are: 

• The cutting-plane and ellipsoid algorithms require only the evaluation of
function values and any one (of possibly many) subgradients of functions. Most
(but not all) descent methods require the computation of a descent direction 
or even steepest descent direction for the function at a point; this can be 
a difficult task in itself, and is always at least as difficult as computing a 
subgradient. 

• Many (but not all) descent methods for convex optimization retain the heuristic
stopping criteria used when they are applied to general (nonconvex) optimization
problems. In contrast we will see that the cutting-plane and ellipsoid
methods have simple stopping criteria that guarantee the optimum has been 
found to a known accuracy. 

• Most descent methods for nondifferentiable optimization are substantially 
more complicated than the cutting-plane and ellipsoid algorithms. These 
methods often include parameters that need to be adjusted for the partic- 
ular problem. 

• For smooth problems, many of the descent methods offer substantially faster 
convergence, e.g., quadratic. 

Let us immediately qualify the foregoing remarks. Many descent methods have 
worked very well in practice: one famous example is the simplex method for linear 
programming. In addition, it is neither possible nor profitable to draw a sharp 
line between descent methods and non-descent methods. For example, the ellipsoid 
algorithm can be interpreted as a type of variable-metric descent algorithm. We 
refer the reader to the references at the end of this chapter. 

14.3 Cutting-Plane Algorithms 

14.3.1 Computing a Lower Bound on $\phi^*$

We consider the unconstrained problem (14.1). Suppose we have computed function
values and at least one subgradient at $x_1, \ldots, x_k$:

$$\phi(x_1), \ldots, \phi(x_k), \quad g_1 \in \partial\phi(x_1), \ldots, g_k \in \partial\phi(x_k). \tag{14.4}$$

Each of these function values and subgradients yields an affine lower bound on
$\phi$:

$$\phi(z) \geq \phi(x_i) + g_i^T(z - x_i) \quad \mbox{for all } z,\ 1 \leq i \leq k,$$

and hence

$$\phi(z) \geq \phi_k^{\rm lb}(z) = \max_{1 \leq i \leq k} \left(\phi(x_i) + g_i^T(z - x_i)\right). \tag{14.5}$$

$\phi_k^{\rm lb}$ is a piecewise linear convex function that is everywhere less than
or equal to $\phi$. Moreover, this global lower bound function is tight at the points
$x_1, \ldots, x_k$, since $\phi(x_i) = \phi_k^{\rm lb}(x_i)$ for $1 \leq i \leq k$.
In fact, $\phi_k^{\rm lb}$ is the smallest convex function that has the function
values and subgradients given in (14.4). An example is shown in figure 14.1.
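
The lower bound function (14.5) is easy to form once the function values and
subgradients are stored. The sketch below does this for a function of one variable;
the particular function $|z - 1| + 0.1z^2$, the evaluation points, and the name
`lower_bound` are assumptions made for illustration.

```python
def phi(z):
    """Convex test function of one variable (an illustrative choice)."""
    return abs(z - 1.0) + 0.1 * z * z

def subgradient(z):
    """One subgradient of phi at z; at the kink z = 1 we use sgn(0) = 0."""
    return ((z > 1.0) - (z < 1.0)) + 0.2 * z

def lower_bound(points):
    """Return phi_lb(z) = max_i (phi(x_i) + g_i * (z - x_i)) as in (14.5)."""
    cuts = [(x, phi(x), subgradient(x)) for x in points]
    def phi_lb(z):
        return max(f + g * (z - x) for x, f, g in cuts)
    return phi_lb

points = [-4.0, 6.0, 1.0]
phi_lb = lower_bound(points)

grid = [-4.0 + 0.1 * k for k in range(101)]
is_lower_bound = all(phi_lb(z) <= phi(z) + 1e-12 for z in grid)
is_tight = all(abs(phi_lb(x) - phi(x)) <= 1e-12 for x in points)
```

The two flags check the properties stated above: the piecewise linear function lies
below $\phi$ everywhere on the grid and agrees with it at the evaluation points.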

It follows that

$$\phi^* \geq L_k = \min_z \phi_k^{\rm lb}(z). \tag{14.6}$$

Figure 14.1 The maximum of the affine support functionals at $x_1$, $x_2$, and
$x_3$ gives the global lower bound function $\phi_3^{\rm lb}$, which is tight at the
points $x_1$, $x_2$, and $x_3$. $L_3$, the minimum of $\phi_3^{\rm lb}$, which
occurs at $x_3^*$, is found by solving (14.7). $L_3$ is a lower bound on $\phi^*$.



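The minimization problem on the right-hand side is readily solved via linear
programming. We can express (14.6) as

$$L_k = \min_{\substack{L,\ z \\ \phi(x_i) + g_i^T(z - x_i) \leq L,\ 1 \leq i \leq k}} L,$$

which has the form of a linear program in the variable $w$:

$$L_k = \min_{Aw \leq b} c^T w \tag{14.7}$$

where

$$w = \begin{bmatrix} z \\ L \end{bmatrix}, \quad
c = \begin{bmatrix} 0 \\ 1 \end{bmatrix}, \quad
A = \begin{bmatrix} g_1^T & -1 \\ \vdots & \vdots \\ g_k^T & -1 \end{bmatrix}, \quad
b = \begin{bmatrix} g_1^T x_1 - \phi(x_1) \\ \vdots \\ g_k^T x_k - \phi(x_k) \end{bmatrix}. \tag{14.8}$$

In fact, this linear program determines not only $L_k$, but also a minimizer of
$\phi_k^{\rm lb}$, which we denote $x_k^*$ (shown in figure 14.1 as well).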
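
Assembling the data (14.8) is a matter of bookkeeping. The sketch below builds
$c$, $A$, and $b$ as plain Python lists, which could then be handed to any linear
programming routine; no particular solver is assumed. The quadratic test function
and the function name `cutting_plane_lp` are illustrative assumptions.

```python
def dot(u, v):
    """Inner product of two vectors given as sequences."""
    return sum(a * b for a, b in zip(u, v))

def cutting_plane_lp(xs, phis, gs):
    """Build c, A, b of (14.8) for the variable w = [z; L].

    xs and gs are lists of n-vectors (evaluation points and subgradients);
    phis is the list of function values phi(x_i).
    """
    n = len(xs[0])
    c = [0.0] * n + [1.0]
    A = [list(g) + [-1.0] for g in gs]
    b = [dot(g, x) - f for x, f, g in zip(xs, phis, gs)]
    return c, A, b

# Illustrative data: phi(z) = z1^2 + z2^2, with gradient 2z, at three points.
xs = [(1.0, 0.0), (0.0, 2.0), (-1.0, -1.0)]
phis = [x[0] ** 2 + x[1] ** 2 for x in xs]
gs = [(2.0 * x[0], 2.0 * x[1]) for x in xs]
c, A, b = cutting_plane_lp(xs, phis, gs)

# For a given z, the smallest feasible L is phi_lb(z): then A w <= b holds
# with equality in at least one row.
z = (0.3, -0.2)
L = max(f + dot(g, [zi - xi for zi, xi in zip(z, x)]) for x, f, g in zip(xs, phis, gs))
w = list(z) + [L]
residuals = [dot(row, w) - bi for row, bi in zip(A, b)]
```

Each row of `A w <= b` is one of the affine lower bounds of (14.5) rearranged, so
the largest residual is zero when `L` equals the lower bound function at `z`.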

The idea behind (14.6) is elementary, but it is very important. It shows that
by knowing only a finite number of function values and subgradients of a convex
function, we can deduce a lower bound on the minimum of the function. No such
property holds for general nonconvex functions. This property will be useful in
designing stopping criteria for optimization algorithms that can guarantee a maximum
error.

The function $\phi_k^{\rm lb}$ can be unbounded below, so that $L_k = -\infty$
(e.g., when $k = 1$ and $g_1 \neq 0$), which is not a useful bound. This can be
avoided by explicitly specifying bounds on the variables, so that we consider

$$\phi^* = \min_{z_{\min} \leq z \leq z_{\max}} \phi(z).$$

In this case we have the lower bound

$$L_k = \min_{z_{\min} \leq z \leq z_{\max}} \phi_k^{\rm lb}(z),$$

which can be computed by adding the bound inequalities
$z_{\min} \leq z \leq z_{\max}$ to the linear program (14.7).

Finally we mention that having computed $\phi$ at the points $x_1, \ldots, x_k$ we
have the simple upper bound on $\phi^*$:

$$U_k = \min_{1 \leq i \leq k} \phi(x_i), \tag{14.9}$$

which is the lowest objective value so far encountered. If an objective value and a
subgradient are evaluated at another point $x_{k+1}$, the new lower and upper bounds
$L_{k+1}$ and $U_{k+1}$ are improved:

$$L_k \leq L_{k+1} \leq \phi^* \leq U_{k+1} \leq U_k.$$

14.3.2 Kelley's Cutting-Plane Algorithm 

Kelley's cutting-plane algorithm is a natural extension of the lower bound compu- 
tation of the previous section. It is simply: 

    $x_1$, $z_{\min}$, $z_{\max}$ $\leftarrow$ any initial box that contains minimizers;
    $k \leftarrow 0$;
    repeat {
        $k \leftarrow k + 1$;
        compute $\phi(x_k)$ and any $g_k \in \partial\phi(x_k)$;
        solve (14.7) to find $x_k^*$ and $L_k$;
        compute $U_k$ using (14.9);
        $x_{k+1} \leftarrow x_k^*$;
    } until ( $U_k - L_k \leq \epsilon$ );
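
To make the iteration concrete, here is a minimal sketch of the algorithm for a
convex function of one variable. It rests on several assumptions of our own: the
test function, the initial box, and the tolerance are illustrative, and instead of
calling a linear programming routine for (14.7) the sketch exploits the fact that in
one variable the piecewise linear lower bound attains its minimum over the box at an
end point or where two cuts intersect.

```python
def kelley_1d(oracle, z_min, z_max, x1, eps=1e-6, max_iter=200):
    """Kelley's cutting-plane algorithm for a convex function of one variable.

    oracle(z) returns (phi(z), g) with g any subgradient of phi at z.
    Returns (x_best, L, U) with L <= phi* <= U and U - L <= eps.
    """
    cuts = []                       # affine lower bounds as (slope, intercept)
    x, x_best, U = x1, x1, float("inf")

    def phi_lb(z):
        return max(a * z + c for a, c in cuts)

    for _ in range(max_iter):
        f, g = oracle(x)
        cuts.append((g, f - g * x))
        if f < U:                   # upper bound (14.9): best value seen so far
            U, x_best = f, x
        # Candidate minimizers of phi_lb on [z_min, z_max]: the end points and
        # the intersections of pairs of cuts that fall inside the box.
        candidates = [z_min, z_max]
        for i, (a1, c1) in enumerate(cuts):
            for a2, c2 in cuts[:i]:
                if a1 != a2:
                    z = (c2 - c1) / (a1 - a2)
                    if z_min <= z <= z_max:
                        candidates.append(z)
        x = min(candidates, key=phi_lb)   # next iterate x_{k+1} = x_k^*
        L = phi_lb(x)                     # lower bound L_k
        if U - L <= eps:
            return x_best, L, U
    raise RuntimeError("no convergence within max_iter iterations")

def oracle(z):
    """phi(z) = |z - 1| + 0.1 z^2 and one of its subgradients."""
    return abs(z - 1.0) + 0.1 * z * z, ((z > 1.0) - (z < 1.0)) + 0.2 * z

x_best, L, U = kelley_1d(oracle, -4.0, 6.0, -4.0)
```

For this test function the minimizer is $z = 1$ with $\phi^* = 0.1$, so on exit the
returned bounds should bracket 0.1 to within the tolerance.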

An example of the second iteration of the cutting-plane algorithm is shown in
figure 14.2. The idea behind Kelley's algorithm is that at each iteration the lower
bound function $\phi_k^{\rm lb}$ is refined or improved (i.e., made larger, so that
$\phi_{k+1}^{\rm lb} \geq \phi_k^{\rm lb}$), since one more term is added to the
maximum in (14.5). Moreover, since $\phi_k^{\rm lb}$ is tight at
$x_1, \ldots, x_k$, it is a good approximation to $\phi$ near $x_1, \ldots, x_k$. In
the next section, we will use this idea to show that the cutting-plane algorithm
always terminates.

Figure 14.2 In the second iteration of the cutting-plane algorithm a subgradient
$g_2$ at $x_2$ is found, giving the global lower bound function $\phi_2^{\rm lb}$.
(14.7) is solved to give $L_2$ and $x_3 = x_2^*$. (The next iteration of the
cutting-plane algorithm is shown in figure 14.1.)

The cutting-plane algorithm maintains upper and lower bounds on the quantity
being computed:

$$L_k \leq \phi^* \leq U_k,$$

which moreover converge as the algorithm proceeds:

$$U_k - L_k \to 0 \quad \mbox{as } k \to \infty.$$

Thus we compute $\phi^*$ to a guaranteed accuracy of $\epsilon$: on exit, we have a
point with a low function value, and in addition we have a proof that there are no
points with function value more than $\epsilon$ better than that of our point.
Stopping criteria for general optimization algorithms (e.g., descent methods) are
often more heuristic: they cannot guarantee that on exit, $\phi^*$ has been computed
to a given accuracy.

Provided <j>* ^ 0, it is also possible to specify a maximum relative error (as 
opposed to absolute error), with the modified stopping criterion 

until {U k - L k < emin{|L fc |, \U k \} ); 

which guarantees a relative accuracy of at least e on exit. These stopping criteria can 
offer a great advantage when the accuracy required is relatively low, for example, 
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10%. This relative accuracy can be achieved long before the objective values or 
iterates Xk appear to be converging; still, we can confidently halt the algorithm. 
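
To make the loop and its stopping test concrete, here is a minimal Python sketch of Kelley's algorithm for a function of a single variable. It is an illustration under stated assumptions, not the code used for the examples in this chapter: the names `kelley_1d`, `phi`, and `subgrad` are hypothetical, and rather than calling a linear programming routine for (14.7) it uses the fact that in one dimension the minimum of the piecewise linear lower bound is attained at an end of the interval or where two cuts intersect.

```python
def kelley_1d(phi, subgrad, z_min, z_max, eps=1e-6, max_iter=200):
    """Minimize a convex function phi on [z_min, z_max] (hypothetical helper)."""
    x = 0.5 * (z_min + z_max)            # x_1: center of the initial box
    cuts = []                            # (phi(x_j), g_j, x_j) for every past iterate
    upper, best_x = float("inf"), x      # U_k and the point that achieves it
    lower = float("-inf")                # L_k
    for k in range(1, max_iter + 1):
        f, g = phi(x), subgrad(x)
        cuts.append((f, g, x))
        if f < upper:                    # U_k = smallest function value seen so far
            upper, best_x = f, x

        def lower_model(z):              # piecewise linear lower bound on phi
            return max(fj + gj * (z - xj) for fj, gj, xj in cuts)

        # Candidate minimizers of the lower bound: interval ends and cut intersections.
        candidates = [z_min, z_max]
        for i, (fi, gi, xi) in enumerate(cuts):
            for fj, gj, xj in cuts[i + 1:]:
                if gi != gj:
                    z = ((fj - gj * xj) - (fi - gi * xi)) / (gi - gj)
                    if z_min <= z <= z_max:
                        candidates.append(z)
        x_next = min(candidates, key=lower_model)
        lower = lower_model(x_next)      # L_k, a guaranteed lower bound on phi*
        if upper - lower <= eps:         # absolute-accuracy stopping criterion
            return best_x, lower, upper, k
        x = x_next
    return best_x, lower, upper, max_iter
```

For instance, `kelley_1d(lambda x: (x - 0.3) ** 2 + 1.0, lambda x: 2.0 * (x - 0.3), -1.0, 2.0)` returns a point together with bounds that bracket the minimum value 1. Replacing the test `upper - lower <= eps` by `upper - lower <= eps * min(abs(lower), abs(upper))` gives the relative-error variant described above.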

A valid criticism of the cutting-plane algorithm is that the number of constraints 
in the linear program (14.7) that must be solved at each iteration grows with the to- 
tal number of elapsed iterations. In practice, if these linear programs are initialized 
at the previous point, they can be solved very rapidly, e.g., in a few simplex itera- 
tions. Some cutting-plane algorithms developed since Kelley's drop constraints, so 
the size of the linear programs to be solved does not grow as the algorithm proceeds; 
see the Notes and References at end of this chapter. 

While the cutting-plane algorithm makes use of all of the information ($\phi(x_i)$, $g_i$,
$i = 1, \ldots, k$) that we have obtained about $\phi$ in previous iterations, the ellipsoid
methods that we will describe later in this chapter maintain a data structure of 
constant size (at the cost of an approximation) that describes what we have learned 
about the function in past iterations. 

14.3.3 Proof of Convergence 

We noted above that when the cutting-plane algorithm terminates, we know that
the optimum objective $\phi^*$ lies between the bounds $L$ and $U$, which differ by at
most $\epsilon$. In this section we show that the cutting-plane algorithm does in fact
always terminate.

Let $B_{\rm init}$ denote the initial box, and

$$G = \sup_{g \in \partial\phi(z),\ z \in B_{\rm init}} \|g\|.$$

($G$ can be shown to be finite.) Suppose that for $k = 1, \ldots, K$ the algorithm has not
terminated, i.e., $U_k - L_k > \epsilon$ for $k = 1, \ldots, K$. Since

$$L_k = \phi_k^{\rm lb}(x_{k+1}) = \max_{j \le k} \left( \phi(x_j) + g_j^T (x_{k+1} - x_j) \right)$$

we have

$$L_k \ge \phi(x_j) + g_j^T (x_{k+1} - x_j), \qquad 1 \le j \le k \le K,$$

and hence using $\phi(x_j) \ge U_k$ and the Cauchy-Schwarz inequality

$$L_k \ge U_k - G \|x_{k+1} - x_j\|, \qquad 1 \le j \le k \le K.$$

From this and $U_k - L_k > \epsilon$ for $k \le K$ we conclude that

$$\|x_i - x_j\| > \frac{\epsilon}{G}, \qquad i, j \le K,\ i \ne j, \tag{14.10}$$

in other words, the minimum distance between any two of the points $x_1, \ldots, x_K$
exceeds $\epsilon/G$.

A volume argument can now be used to show that $K$ cannot be too large.
Around each $x_i$ we place a ball $B_i$ of diameter $\epsilon/G$. By (14.10), these balls do
not intersect, so their total volume is $K$ times the volume of one ball. These balls
are all contained in a box $B$ which is the original box $B_{\rm init}$ enlarged in every
dimension by $\epsilon/2G$. Hence the total volume of the balls must be less than the
volume of $B$. We conclude that $K$ is no larger than the volume of $B$ divided by the
volume of one of the balls.

If we take $K_{\max} = {\rm vol}(B)/{\rm vol}(B_1)$, then within $K_{\max}$ iterations, the
cutting-plane algorithm terminates. (We comment that this upper bound is a very poor
bound, vastly greater than the typical number of iterations required.)
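
To get a feeling for how pessimistic this bound is, the following Python sketch evaluates the ratio of volumes for a given box. The function names and the sample values of the tolerance and of $G$ are hypothetical, and the sketch assumes that "enlarged in every dimension by $\epsilon/2G$" means that each side of the box is extended by $\epsilon/2G$ at both ends, which is what is needed for the balls to fit inside.

```python
import math

def ball_volume(n, radius):
    """Volume of a Euclidean ball of the given radius in R^n."""
    return math.pi ** (n / 2) / math.gamma(n / 2 + 1) * radius ** n

def kelley_iteration_bound(z_min, z_max, eps, G):
    """K_max = vol(B) / vol(B_1) from the volume argument (hypothetical helper)."""
    n = len(z_min)
    pad = eps / (2 * G)                  # assumed enlargement at each end of each side
    box_volume = math.prod((hi - lo) + 2 * pad for lo, hi in zip(z_min, z_max))
    return box_volume / ball_volume(n, eps / (2 * G))   # each ball has diameter eps/G

# Box of the example in section 14.3.5; eps and G are made-up values.
k_max = kelley_iteration_bound([-0.9, -0.9], [1.9, 1.9], eps=0.01, G=10.0)
```

With these made-up values `k_max` is of the order of ten million, whereas the example in section 14.3.5 reaches a relative error of 0.01% in about nine iterations.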

14.3.4 Cutting-Plane Algorithm with Constraints 

The cutting-plane algorithm of the previous section can be modified in many ways 
to handle the constrained optimization problem (14.2). We will show one simple 
method, which uses the same basic idea of forming a piecewise linear lower bound 
approximation of a convex function based on the function values and subgradients 
already evaluated. 

Suppose we have computed function values and at least one subgradient at
$x_1, \ldots, x_k$ for both the objective and the constraint function:

$$\phi(x_1), \ldots, \phi(x_k), \qquad g_1 \in \partial\phi(x_1), \ldots, g_k \in \partial\phi(x_k),$$
$$\psi(x_1), \ldots, \psi(x_k), \qquad h_1 \in \partial\psi(x_1), \ldots, h_k \in \partial\psi(x_k).$$

These points $x_i$ need not be feasible.

We form piecewise linear lower bound functions for both the objective and the
constraint: $\phi_k^{\rm lb}$ in (14.5) and

$$\psi_k^{\rm lb}(z) \triangleq \max_{1 \le i \le k} \left( \psi(x_i) + h_i^T (z - x_i) \right), \tag{14.11}$$

which satisfies $\psi_k^{\rm lb}(z) \le \psi(z)$ for all $z \in \mathbf{R}^n$.

The lower bound function $\psi_k^{\rm lb}$ yields a polyhedral outer approximation to the
feasible set:

$$\{ z \mid \psi(z) \le 0 \} \subseteq \{ z \mid \psi_k^{\rm lb}(z) \le 0 \}. \tag{14.12}$$

Thus we have the following lower bound on $\phi^*$:

$$\phi^* \ge L_k \triangleq \min \{ \phi_k^{\rm lb}(z) \mid \psi_k^{\rm lb}(z) \le 0 \}. \tag{14.13}$$

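As in section 14.3.1, the optimization problem (14.13) is equivalent to a linear
program:

$$L_k = \min_{Aw \le b} c^T w \tag{14.14}$$

where

$$w = \begin{bmatrix} z \\ L \end{bmatrix}, \quad
c = \begin{bmatrix} 0 \\ 1 \end{bmatrix}, \quad
A = \begin{bmatrix} g_1^T & -1 \\ \vdots & \vdots \\ g_k^T & -1 \\ h_1^T & 0 \\ \vdots & \vdots \\ h_k^T & 0 \end{bmatrix}, \quad
b = \begin{bmatrix} g_1^T x_1 - \phi(x_1) \\ \vdots \\ g_k^T x_k - \phi(x_k) \\ h_1^T x_1 - \psi(x_1) \\ \vdots \\ h_k^T x_k - \psi(x_k) \end{bmatrix}.$$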

We note again that the lower bound $L_k$ in (14.13) can be computed no matter how
the points and subgradients were chosen.

The modified cutting-plane algorithm is exactly the same as the cutting-plane 
algorithm, except that the linear program (14.14) is solved instead of (14.8), and 
the stopping criterion must be modified for reasons that we now consider. 

If the feasible set is empty then eventually the linear program (14.14) will be- 
come infeasible. When this occurs, the cutting-plane algorithm terminates with the 
conclusion that the feasible set is empty. On the other hand, if the feasible set is not 
empty, and the optimum occurs on the boundary of the feasible set, which is often 
the case, then the iterates $x_k$ are generally infeasible, since they lie in the outer
approximation of the feasible set given by the right-hand side of (14.12). However,
they approach feasibility (and optimality): it can be shown that

$$\lim_{k \to \infty} \phi(x_k) = \phi^*, \qquad \limsup_{k \to \infty} \psi(x_k) \le 0.$$

The second relation means that given any $\epsilon_{\rm feas} > 0$ we eventually have
$\psi(x_k) \le \epsilon_{\rm feas}$.

This has an important consequence for the upper bound and stopping criterion 
described in section 14.3.2. If $x_k$ is not feasible, then $\phi(x_k)$ is not an upper
bound on $\phi^*$. Therefore we cannot use (14.9) to compute an upper bound on $\phi^*$.
A good stopping criterion for the modified cutting-plane algorithm is:

    until ( $\psi(x_k) \le \epsilon_{\rm feas}$ and $\phi(x_k) - L_k \le \epsilon_{\rm obj}$ );

We interpret $\epsilon_{\rm feas}$ as a feasibility tolerance and $\epsilon_{\rm obj}$ as an
objective tolerance. When the algorithm successfully terminates we are guaranteed to
have found a point that is feasible and has objective value within $\epsilon_{\rm obj}$ of
optimal for the $\epsilon_{\rm feas}$-relaxed problem

$$\min \{ \phi(z) \mid \psi(z) \le \epsilon_{\rm feas} \}$$

(but probably not feasible for the problem (14.2)).

Other modifications of the cutting-plane algorithm generate only feasible iter-
ates; see the Notes and References at the end of this chapter.

14.3.5 Example

In this example we will use the cutting-plane algorithm to minimize a convex func-
tion on $\mathbf{R}^2$. We will use the standard example plant that we introduced in
section 2.4.

The function to be minimized is

$$\phi\left( \begin{bmatrix} \alpha \\ \beta \end{bmatrix} \right) = \varphi_{\rm wt\_max}(\alpha, \beta),$$

where the function

$$\varphi_{\rm wt\_max}(\alpha, \beta) = \max \{ \varphi_{\rm pk\_trk}(\alpha, \beta),\ 0.5\,\varphi_{\rm max\_sens}(\alpha, \beta),\ 15\,\varphi_{\rm rms\_yp}(\alpha, \beta) \}$$

was defined in section 11.7. The level curves of $\varphi_{\rm wt\_max}$ are plotted in
figure 11.23. The minimum value of $\varphi_{\rm wt\_max}$ is 0.737, which occurs at
$x^* = [0.05\ \ 0.09]^T$.

The bounding box for the cutting-plane algorithm was $z_{\min} = [-0.9\ \ {-0.9}]^T$ and
$z_{\max} = [1.9\ \ 1.9]^T$. This was an aesthetic choice; in a real problem the bounding
box would be much larger. The starting point, $x_1 = [0.5\ \ 0.5]^T$, is the center of the
bounding box. The upper and lower bounds versus iteration number are shown in
figure 14.3. The maximum relative error, i.e., $(U_k - L_k)/L_k$, is shown in figure 14.4.
Level curves of $\phi_k^{\rm lb}$ are shown for $k = 1, 2, 3, 4$ in figures 14.5, 14.6, 14.7,
and 14.8. The reader is encouraged to trace the execution of the algorithm through these
figures, and compare the level curves of $\phi_k^{\rm lb}$ with those of $\varphi_{\rm wt\_max}$
in figure 11.23.



Figure 14.3 Upper and lower bounds on the solution $\phi^*$, as a function of
the iteration number $k$, are shown for the cutting-plane algorithm.

Figure 14.4 For the cutting-plane algorithm the maximum relative error,
defined as $(U_k - L_k)/L_k$, falls below 0.01% by iteration 9.

Figure 14.5 The cutting-plane algorithm is started at the point $x_1 =
[0.5\ \ 0.5]^T$. The subgradient $g_1$ at $x_1$ gives an affine global lower bound
$\phi_1^{\rm lb}$ for $\phi$. The level curves of $\phi_1^{\rm lb}$ are shown, together with the
solution $x_2$ to (14.7). This point gives the lower bound $L_1 = -1.12$. (The dashed
line shows the bounding box, and $x^*$ is the minimizer of $\varphi_{\rm wt\_max}$.)

Figure 14.6 The level curves of $\phi_2^{\rm lb}$ are shown, together with the solution
$x_3$ to (14.7). This point gives the lower bound $L_2 = 0.506$.

Figure 14.7 The level curves of $\phi_3^{\rm lb}$ are shown, together with the solution
$x_4$ to (14.7). This point gives the lower bound $L_3 = 0.668$.

Figure 14.8 The level curves of $\phi_4^{\rm lb}$ are shown, together with the solution
$x_5$ to (14.7). This point gives the lower bound $L_4 = 0.725$.

14.4 Ellipsoid Algorithms

14.4.1 Basic Ellipsoid Algorithm 

We first consider the unconstrained minimization problem (14.1). The ellipsoid al-
gorithm generates a "decreasing" sequence of ellipsoids in $\mathbf{R}^n$ that are guaranteed
to contain a minimizing point, using the idea that given a subgradient (or quasigra-
dient) at a point, we can find a half-space containing the point that is guaranteed
not to contain any minimizers of $\phi$ (see figure 13.2).

Suppose that we have an ellipsoid $E_k$ that is guaranteed to contain a minimizer
of $\phi$. In the basic ellipsoid algorithm, we compute a subgradient $g_k$ of $\phi$ at the
center, $x_k$, of $E_k$. We then know that the "sliced" half ellipsoid

$$E_k \cap \{ z \mid g_k^T (z - x_k) \le 0 \}$$

contains a minimizer of $\phi$, as shown in figure 14.9.




Figure 14.9 At the $k$th iteration of the ellipsoid algorithm a minimizer
of $\phi$ is known to lie in the ellipsoid $E_k$ centered at $x_k$. The subgradient $g_k$
at $x_k$ determines a half-space (below and to the left of the dashed line) in
which $\phi(x)$ is at least $\phi(x_k)$. Therefore a minimizer of $\phi$ lies in the shaded
region.

We compute the ellipsoid $E_{k+1}$ of minimum volume that contains the sliced
half ellipsoid; $E_{k+1}$ is then guaranteed to contain a minimizer of $\phi$, as shown in
figure 14.10. The process is then repeated.

Figure 14.10 The shaded region in figure 14.9 is enclosed by the ellipsoid of
smallest volume, denoted $E_{k+1}$, and centered at $x_{k+1}$. Any subgradient $g_{k+1}$
at $x_{k+1}$ is found and the next iteration of the ellipsoid algorithm follows. In
$\mathbf{R}^2$ the area of $E_{k+1}$ is always 77% of the area of $E_k$.

We now describe the algorithm more explicitly. An ellipsoid $E$ can be described
as

$$E = \{ z \mid (z - a)^T A^{-1} (z - a) \le 1 \}$$

where $A = A^T > 0$. Here $a$ is the center of the ellipsoid $E$ and the matrix $A$ gives the
"size" and orientation of $E$: the square roots of the eigenvalues of $A$ are the lengths
of the semi-axes of $E$. The volume of $E$ is given by

$${\rm vol}(E) = \beta_n \sqrt{\det A},$$

where $\beta_n$ is the volume of the unit sphere in $\mathbf{R}^n$; in fact,

$$\beta_n = \frac{\pi^{n/2}}{\Gamma(n/2 + 1)},$$

but we will not need this result.

The minimum volume ellipsoid that contains the half ellipsoid

$$\{ z \mid (z - a)^T A^{-1} (z - a) \le 1,\ g^T (z - a) \le 0 \}$$

is given by

$$\tilde E = \{ z \mid (z - \tilde a)^T \tilde A^{-1} (z - \tilde a) \le 1 \},$$

where

$$\tilde a = a - \frac{A \tilde g}{n + 1}, \tag{14.15}$$

$$\tilde A = \frac{n^2}{n^2 - 1} \left( A - \frac{2}{n + 1} A \tilde g \tilde g^T A \right), \tag{14.16}$$

and

$$\tilde g = g \Big/ \sqrt{g^T A g}$$

is a normalized subgradient. (Note that the sliced ellipsoid depends only on the
direction of $g$, and not its length.)

Thus, the basic ellipsoid algorithm is: 

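    $x_1,\ A_1 \leftarrow$ any initial ellipsoid that contains minimizers;
    $k \leftarrow 0$;
    repeat {
        $k \leftarrow k + 1$;
        evaluate $\phi(x_k)$ and compute any $g_k \in \partial\phi(x_k)$;
        $\tilde g \leftarrow g_k \big/ \sqrt{g_k^T A_k g_k}$;
        $x_{k+1} \leftarrow x_k - A_k \tilde g/(n + 1)$;
        $A_{k+1} \leftarrow \frac{n^2}{n^2 - 1} \left( A_k - \frac{2}{n + 1} A_k \tilde g \tilde g^T A_k \right)$;
    } until ( stopping criterion );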

From (14.16), we can think of $E_{k+1}$ as slightly thinner than $E_k$ in the direction
$g_k$, and slightly enlarged overall. Even though $E_{k+1}$ can be larger than $E_k$ in the
sense of maximum semi-axis ($\lambda_{\max}(A_{k+1}) > \lambda_{\max}(A_k)$ is possible), it turns
out that its volume is less:

$${\rm vol}(E_{k+1}) = \left( \frac{n}{n+1} \right)^{(n+1)/2} \left( \frac{n}{n-1} \right)^{(n-1)/2} {\rm vol}(E_k) \tag{14.17}$$

$$< e^{-\frac{1}{2n}}\, {\rm vol}(E_k), \tag{14.18}$$

by a factor that only depends on the dimension $n$. We will use this fact in the next
section to show that the ellipsoid algorithm converges, i.e.,

$$\lim_{k \to \infty} \phi(x_k) = \phi^*,$$

provided our original ellipsoid $E_1$ (which is often a large ball, i.e., $x_1 = 0$,
$A_1 = R^2 I$) contains a minimizing point in its interior.

Since we always know that there is a minimizer $z^* \in E_k$, we have

$$\phi^* = \phi(z^*) \ge \phi(x_k) + g_k^T (z^* - x_k)$$

for some $z^* \in E_k$, and hence

$$\phi(x_k) - \phi^* \le -g_k^T (z^* - x_k) \le \max_{z \in E_k} -g_k^T (z - x_k) = \sqrt{g_k^T A_k g_k}.$$

Thus the simple stopping criterion

    until ( $\sqrt{g_k^T A_k g_k} \le \epsilon$ );

guarantees that on exit, $\phi(x_k)$ is within $\epsilon$ of $\phi^*$. A more sophisticated stopping
criterion is

    until ( $U_k - L_k \le \epsilon$ );

where

$$U_k = \min_{1 \le i \le k} \phi(x_i), \qquad L_k = \max_{1 \le i \le k} \left( \phi(x_i) - \sqrt{g_i^T A_i g_i} \right). \tag{14.19}$$

While the ellipsoid algorithm works for quasiconvex functions (with the $g_k$'s
quasigradients), this stopping criterion does not.
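
The update (14.15)-(14.16) and the simple stopping criterion translate almost line by line into code. The following Python sketch is one possible rendering, written with plain lists so that it has no dependencies; the names `ellipsoid_minimize`, `phi`, and `subgrad` are hypothetical, the dimension is assumed to be at least two, and the `max(0.0, ...)` guard against rounding error is an addition that does not appear in the algorithm above.

```python
import math

def ellipsoid_minimize(phi, subgrad, x, A, eps=1e-6, max_iter=10_000):
    """Basic ellipsoid algorithm; x is the center and A the matrix of E_1."""
    n = len(x)                                   # assumed n >= 2
    best_x, best_val = list(x), float("inf")
    for k in range(1, max_iter + 1):
        f, g = phi(x), subgrad(x)
        if f < best_val:                         # keep the best point seen so far
            best_x, best_val = list(x), f
        Ag = [sum(A[i][j] * g[j] for j in range(n)) for i in range(n)]
        width = math.sqrt(max(0.0, sum(g[i] * Ag[i] for i in range(n))))
        if width <= eps:                         # phi(x_k) - phi* <= sqrt(g^T A g)
            return best_x, best_val, k
        Agt = [v / width for v in Ag]            # A g~, with g~ the normalized subgradient
        x = [x[i] - Agt[i] / (n + 1) for i in range(n)]           # (14.15)
        scale = n * n / (n * n - 1.0)
        A = [[scale * (A[i][j] - 2.0 / (n + 1) * Agt[i] * Agt[j])  # (14.16)
              for j in range(n)] for i in range(n)]
    return best_x, best_val, max_iter

# Made-up test function: a convex quadratic with minimizer (1, -0.5) and minimum value 0.
quad = lambda z: (z[0] - 1.0) ** 2 + 2.0 * (z[1] + 0.5) ** 2
quad_grad = lambda z: [2.0 * (z[0] - 1.0), 4.0 * (z[1] + 0.5)]
x_best, val_best, iters = ellipsoid_minimize(quad, quad_grad, [0.0, 0.0], [[9.0, 0.0], [0.0, 9.0]])
```

The sketch uses the fact that $A$ is symmetric, so that $A \tilde g \tilde g^T A$ is the outer product of the vector $A \tilde g$ with itself. Tracking $U_k$ and $L_k$ from (14.19) instead of the single quantity `width` would give the more sophisticated criterion.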

14.4.2 Proof of Convergence 

In this section we show that the ellipsoid algorithm converges. We suppose that
$z^* \in E_1$ and that for $k = 1, \ldots, K$, $\phi(x_k) > \phi^* + \epsilon$, where $\epsilon > 0$. Then
every point $z$ excluded in iterations $1, \ldots, K$ has $\phi(z) > \phi^* + \epsilon$, since at
iteration $k$ the function values in each excluded half-space exceed $\phi(x_k)$. If

$$G = \max_{g \in \partial\phi(x),\ x \in E_1} \|g\|$$

is the maximum length of the subgradients over the initial ellipsoid, then we find
that in the ball

$$B = \{ z \mid \|z - z^*\| \le \epsilon/G \}$$

we have $\phi(z) \le \phi^* + \epsilon$ (we assume without loss of generality that $B \subseteq E_1$),
and consequently no point of $B$ was excluded in iterations $1, \ldots, K$, so that in fact

$$B \subseteq E_{K+1}.$$

Thus, ${\rm vol}(E_{K+1}) \ge {\rm vol}(B)$, so using (14.17-14.18),

$$e^{-\frac{K}{2n}}\, {\rm vol}(E_1) \ge (\epsilon/G)^n \beta_n.$$

For $E_1 = \{ z \mid \|z\| \le R \}$ we have ${\rm vol}(E_1) = R^n \beta_n$, so, taking logs,

$$-\frac{K}{2n} + n \log R \ge n \log \frac{\epsilon}{G},$$

and therefore

$$K \le 2n^2 \log \frac{RG}{\epsilon}.$$

Thus to compute $\phi^*$ with error at most $\epsilon$, it takes no more than
$2n^2 \log (RG/\epsilon)$ iterations of the ellipsoid algorithm; this number grows slowly
with both dimension $n$ and accuracy $\epsilon$. We will return to this important point in
section 14.6.
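
The two quantities that drive this bound are easy to evaluate numerically. The short Python sketch below computes the volume reduction factor in (14.17) and the iteration bound; the function names and the sample values of $R$, $G$, and the tolerance are hypothetical.

```python
import math

def volume_ratio(n):
    """vol(E_{k+1}) / vol(E_k) from (14.17); depends only on n >= 2."""
    return (n / (n + 1)) ** ((n + 1) / 2) * (n / (n - 1)) ** ((n - 1) / 2)

def ellipsoid_iteration_bound(n, R, G, eps):
    """Worst-case iteration count 2 n^2 log(R G / eps)."""
    return 2 * n * n * math.log(R * G / eps)

ratio_2d = volume_ratio(2)                                 # about 0.77, as noted in figure 14.10
bound_2d = ellipsoid_iteration_bound(2, R=1.5, G=10.0, eps=1e-3)   # made-up R, G, eps
```

In two dimensions the ratio is about 0.77, which agrees with the caption of figure 14.10 and lies just below the bound $e^{-1/4} \approx 0.78$ of (14.18). With the made-up values above the iteration bound is roughly 77.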
14.4.3 Ellipsoid Algorithm with Constraints

The basic ellipsoid algorithm is readily modified to solve the constrained prob-
lem (14.2). In this section we describe one such modification.

Once again, we generate a sequence of ellipsoids of decreasing volume, each of
which is guaranteed to contain a feasible minimizer. If $x_k$ is feasible ($\psi(x_k) \le 0$)
then we form $E_{k+1}$ exactly as in the basic ellipsoid algorithm; we call this an
objective iteration. If $x_k$ is infeasible ($\psi(x_k) > 0$) then we form $E_{k+1}$ as in the
basic ellipsoid algorithm, but using a subgradient of the constraint instead of the
objective. We call this a constraint iteration.

The algorithm is thus: 

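    $x_1,\ A_1 \leftarrow$ an ellipsoid that contains feasible minimizers (if there are any);
    $k \leftarrow 0$;
    repeat {
        $k \leftarrow k + 1$;
        compute $\psi(x_k)$;
        if ( $\psi(x_k) > 0$ ) {
            /* $x_k$ is infeasible */
            compute any $h_k \in \partial\psi(x_k)$;
            $\tilde g \leftarrow h_k \big/ \sqrt{h_k^T A_k h_k}$;
            if ( $\psi(x_k) - \sqrt{h_k^T A_k h_k} > 0$ ) {
                quit because the feasible set is empty;
            }
        } else {
            /* $x_k$ is feasible */
            compute $\phi(x_k)$ and any $g_k \in \partial\phi(x_k)$;
            $\tilde g \leftarrow g_k \big/ \sqrt{g_k^T A_k g_k}$;
        }
        $x_{k+1} \leftarrow x_k - A_k \tilde g/(n + 1)$;
        $A_{k+1} \leftarrow \frac{n^2}{n^2 - 1} \left( A_k - \frac{2}{n + 1} A_k \tilde g \tilde g^T A_k \right)$;
    } until ( $\psi(x_k) \le 0$ and $\sqrt{g_k^T A_k g_k} \le \epsilon$ );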

In a constraint iteration, the points we discard are all infeasible. In an objective 
iteration, the points we discard all have objective value greater than or equal to the 
current, feasible point. Thus in each case, we do not discard any minimizers, so that 
the ellipsoids will always contain any minimizers that are in the initial ellipsoid. 

The same proof as for the basic ellipsoid algorithm shows that this modified
algorithm works provided the set of points that are feasible and have nearly op-
timal objective value has positive volume. More sophisticated variations work in
other cases (e.g., equality constraints). Alternatively, if we allow slightly violated
constraints, we can use the stopping criterion

    until ( $\psi(x_k) \le \epsilon_{\rm feas}$ and $\sqrt{g_k^T A_k g_k} \le \epsilon_{\rm obj}$ );

With this stopping criterion, the modified algorithm will work even when the set of
feasible points with nearly optimal objective value does not have positive volume,
e.g., with equality constraints. In this case the modified algorithm produces nearly
optimal points for the $\epsilon_{\rm feas}$-relaxed problem, just as in the modified cutting-plane
algorithm.
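
The alternation between objective and constraint iterations can be sketched in Python as follows. This is an illustrative rendering with hypothetical names (`constrained_ellipsoid`, `phi_sub`, `psi_sub`), not a reference implementation; it uses the first stopping criterion (a feasible center and a small value of $\sqrt{g_k^T A_k g_k}$), assumes the dimension is at least two, and reports a status string, which is a convention chosen here.

```python
import math

def constrained_ellipsoid(phi, phi_sub, psi, psi_sub, x, A, eps=1e-4, max_iter=10_000):
    """Minimize phi subject to psi <= 0 starting from the ellipsoid (x, A)."""
    n = len(x)                                   # assumed n >= 2
    best_x, best_val = None, float("inf")
    for k in range(1, max_iter + 1):
        violation = psi(x)
        feasible = violation <= 0
        if feasible:                             # objective iteration
            f, g = phi(x), phi_sub(x)
            if f < best_val:
                best_x, best_val = list(x), f
        else:                                    # constraint iteration
            g = psi_sub(x)
        Ag = [sum(A[i][j] * g[j] for j in range(n)) for i in range(n)]
        width = math.sqrt(max(0.0, sum(g[i] * Ag[i] for i in range(n))))
        if not feasible and violation - width > 0:
            return "empty", None, None, k        # psi is positive on all of E_k
        if feasible and width <= eps:
            return "optimal", best_x, best_val, k
        Agt = [v / width for v in Ag]
        x = [x[i] - Agt[i] / (n + 1) for i in range(n)]
        scale = n * n / (n * n - 1.0)
        A = [[scale * (A[i][j] - 2.0 / (n + 1) * Agt[i] * Agt[j])
              for j in range(n)] for i in range(n)]
    return "max_iter", best_x, best_val, max_iter

# Made-up problem: minimize z0 + z1 over the unit disc; the optimal value is -sqrt(2).
status, x_opt, val_opt, iters = constrained_ellipsoid(
    lambda z: z[0] + z[1], lambda z: [1.0, 1.0],
    lambda z: z[0] ** 2 + z[1] ** 2 - 1.0, lambda z: [2.0 * z[0], 2.0 * z[1]],
    [0.0, 0.0], [[4.0, 0.0], [0.0, 4.0]])
```

Replacing the test `feasible = violation <= 0` by a comparison with a feasibility tolerance gives the relaxed stopping criterion discussed above.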

14.4.4 Deep-Cut Ellipsoid Algorithm for Inequality Specifications 

A simple variation of the ellipsoid algorithm can be used to solve the feasibility
problem (14.3). This modified algorithm often performs better than the ellipsoid
algorithm applied to the constraint function $\psi$.

The idea is based on figure 13.3: suppose we are given an $x$ that is not feasible,
so that $\psi(x) > 0$. Given $h \in \partial\psi(x)$, we have for all $z$

$$\psi(z) \ge \psi(x) + h^T (z - x)$$

so for every feasible point $z_{\rm feas}$ we have

$$h^T (z_{\rm feas} - x) \le -\psi(x).$$

We can therefore exclude from consideration the half-space

$$\{ z \mid h^T (z - x) > -\psi(x) \}$$

which is bigger ($\psi(x) > 0$) than the half-space $\{ z \mid h^T (z - x) > 0 \}$ excluded in
the ellipsoid algorithm, as shown in figure 14.11.

In the modified ellipsoid algorithm, we maintain ellipsoids guaranteed to contain
a feasible point (if there is one), as shown in figures 14.11 and 14.12. If $x_k$ is not
feasible ($\psi(x_k) > 0$), we let $E_{k+1}$ be the minimum volume ellipsoid that contains
the set

$$S_k = E_k \cap \{ z \mid h_k^T (z - x_k) \le -\psi(x_k) \}.$$

If $S_k = \emptyset$, we know that there are no feasible points. This happens if and only if

$$\sqrt{h_k^T A_k h_k} < \psi(x_k).$$

Otherwise, $E_{k+1}$ is given by

$$x_{k+1} = x_k - \frac{1 + n\alpha}{n + 1} A_k \tilde h_k, \tag{14.20}$$

$$A_{k+1} = \frac{n^2}{n^2 - 1} (1 - \alpha^2) \left( A_k - \frac{2(1 + n\alpha)}{(n + 1)(1 + \alpha)} A_k \tilde h_k \tilde h_k^T A_k \right), \tag{14.21}$$

where

$$\alpha = \frac{\psi(x_k)}{\sqrt{h_k^T A_k h_k}}, \qquad \tilde h_k = \frac{h_k}{\sqrt{h_k^T A_k h_k}}.$$

Figure 14.11 At the $k$th iteration of the deep-cut ellipsoid algorithm any
feasible point must lie in the ellipsoid $E_k$ centered at $x_k$. If $x_k$ is infeasible
(meaning $\psi(x_k) > 0$) any subgradient $h_k$ of $\psi$ at $x_k$ determines a half-space
(below and to the left of the dotted line) in which $\psi(x)$ is known to be
positive. Thus any feasible point is now known to lie in the shaded region.

Figure 14.12 The shaded region in figure 14.11 is enclosed by the ellipsoid
of smallest volume, denoted $E_{k+1}$, and centered at $x_{k+1}$. If the point $x_{k+1}$
is feasible the algorithm terminates. Otherwise, any subgradient $h_{k+1}$ of $\psi$
at $x_{k+1}$ is found. The algorithm then proceeds as shown in figure 14.11.

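In forming $E_{k+1}$, we cut out more of the ellipsoid $E_k$ than in the basic ellipsoid
method, so this algorithm is called a deep-cut ellipsoid method. The deep-cut
ellipsoid algorithm is thus:

    $x_1,\ A_1 \leftarrow$ any ellipsoid that contains feasible points (if there are any);
    $k \leftarrow 1$;
    repeat {
        compute $\psi(x_k)$ and any $h_k \in \partial\psi(x_k)$;
        if ( $\psi(x_k) \le 0$ ) {
            done: $x_k$ is feasible;
        }
        if ( $\psi(x_k) > \sqrt{h_k^T A_k h_k}$ ) {
            quit: the feasible set is empty;
        }
        $\alpha \leftarrow \psi(x_k) \big/ \sqrt{h_k^T A_k h_k}$;
        $\tilde h_k \leftarrow h_k \big/ \sqrt{h_k^T A_k h_k}$;
        $x_{k+1} \leftarrow x_k - \frac{1 + n\alpha}{n + 1} A_k \tilde h_k$;
        $A_{k+1} \leftarrow \frac{n^2}{n^2 - 1} (1 - \alpha^2) \left( A_k - \frac{2(1 + n\alpha)}{(n + 1)(1 + \alpha)} A_k \tilde h_k \tilde h_k^T A_k \right)$;
        $k \leftarrow k + 1$;
    }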

Deep-cuts can be used for the constraint iterations in the modified ellipsoid
algorithm for the constrained problem; they can also be used for the objective
iterations in the ellipsoid algorithm for the unconstrained or constrained problems,
as follows. At iteration $k$, it is known that the optimum function value does not
exceed $U_k$, so we can cut out the half-space in which the function must exceed $U_k$,
which is often larger (if $U_k < \phi(x_k)$) than the half-space in which the function must
exceed $\phi(x_k)$.
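
A Python sketch of the deep-cut feasibility algorithm, following (14.20) and (14.21), is given below. The function name `deep_cut_feasibility`, the status strings, and the two made-up constraint functions are assumptions introduced for illustration; the dimension is assumed to be at least two.

```python
import math

def deep_cut_feasibility(psi, subgrad, x, A, max_iter=1000):
    """Look for a point with psi <= 0 inside the ellipsoid with center x and matrix A."""
    n = len(x)                                   # assumed n >= 2
    for k in range(1, max_iter + 1):
        value = psi(x)
        if value <= 0:
            return "feasible", list(x), k
        h = subgrad(x)
        Ah = [sum(A[i][j] * h[j] for j in range(n)) for i in range(n)]
        width = math.sqrt(max(0.0, sum(h[i] * Ah[i] for i in range(n))))
        if value > width:
            return "empty", None, k              # the set S_k is empty
        alpha = value / width                    # depth of the cut, between 0 and 1
        Aht = [v / width for v in Ah]            # A h~, with h~ the normalized subgradient
        step = (1 + n * alpha) / (n + 1)
        x = [x[i] - step * Aht[i] for i in range(n)]                       # (14.20)
        scale = n * n / (n * n - 1.0) * (1 - alpha * alpha)
        shrink = 2 * (1 + n * alpha) / ((n + 1) * (1 + alpha))
        A = [[scale * (A[i][j] - shrink * Aht[i] * Aht[j])                 # (14.21)
              for j in range(n)] for i in range(n)]
    return "unknown", None, max_iter

# Made-up constraint: the feasible set is the square |z0 - 2| <= 0.5, |z1 - 1| <= 0.5.
def box_psi(z):
    return max(abs(z[0] - 2.0) - 0.5, abs(z[1] - 1.0) - 0.5)

def box_subgrad(z):
    if abs(z[0] - 2.0) >= abs(z[1] - 1.0):
        return [math.copysign(1.0, z[0] - 2.0), 0.0]
    return [0.0, math.copysign(1.0, z[1] - 1.0)]

status_box, x_feas, iters_box = deep_cut_feasibility(
    box_psi, box_subgrad, [0.0, 0.0], [[16.0, 0.0], [0.0, 16.0]])
# Made-up infeasible case: z0 + 10 is positive everywhere on the unit disc.
status_empty, _, _ = deep_cut_feasibility(
    lambda z: z[0] + 10.0, lambda z: [1.0, 0.0], [0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])
```

Setting `alpha` to zero in the two update lines recovers the basic ellipsoid update (14.15)-(14.16), which is one way to check the formulas against each other.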

14.4.5 Example 

We use the same function, $\varphi_{\rm wt\_max}$, that we used in the cutting-plane example
in section 14.3.5. The algorithm was started with $E_1$ set to a circle of radius 3/2
about $x_1 = [0.5\ \ 0.5]^T$. Ellipsoid volume is shown in figure 14.13. Upper and lower
bounds versus iteration number are shown in figure 14.14. The worst case relative
error is shown in figure 14.15. The ellipsoid $E_k$ is shown at various iterations in
figure 14.16.

Figure 14.13 The volume of the ellipsoid $E_k$ decreases exponentially with
the iteration number $k$.

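For the deep-cut algorithm we consider the inequality specifications

$$\varphi_{\rm pk\_trk}(\alpha, \beta) \le 0.75, \qquad \varphi_{\rm max\_sens}(\alpha, \beta) \le 1.5, \qquad \varphi_{\rm rms\_yp}(\alpha, \beta) \le 0.05,$$

using the constraint function

$$\psi\left( \begin{bmatrix} \alpha \\ \beta \end{bmatrix} \right) \triangleq \max \left\{ \frac{\varphi_{\rm pk\_trk}(\alpha, \beta) - 0.75}{0.75},\ \frac{\varphi_{\rm max\_sens}(\alpha, \beta) - 1.5}{1.5},\ \frac{\varphi_{\rm rms\_yp}(\alpha, \beta) - 0.05}{0.05} \right\}. \tag{14.22}$$

The deep-cut ellipsoid algorithm finds a feasible point in 8 iterations. The execution
of the algorithm is traced in table 14.1 and figure 14.17. Figure 14.17 also shows
the set of feasible points, which coincides with the set where
$\varphi_{\rm wt\_max}(\alpha, \beta) \le 0.75$.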



Figure 14.14 Upper and lower bounds on the solution $\phi^*$, as a function
of the iteration number $k$, for the ellipsoid algorithm.

Figure 14.15 For the ellipsoid algorithm the maximum relative error, de-
fined as $(U_k - L_k)/L_k$, falls below 0.01% by iteration 54.

Figure 14.16 The initial ellipsoid, $E_1$, is a circle of radius 1.5 centered at
$[0.5\ \ 0.5]^T$. The ellipsoids at the 7th, 17th, and 22nd iterations are shown,
together with the optimum $x^*$.

Figure 14.17 The initial ellipsoid, $E_1$, is a circle of radius 1.5 centered at
$x_1 = [0.5\ \ 0.5]^T$. After eight iterations of the deep-cut ellipsoid algorithm
a feasible point $x_8$ is found for the constraint function (14.22). The set of
feasible points for these specifications is shaded. Note that each ellipsoid
contains the entire feasible set. See also table 14.1.

    k   $x_k$               $\psi(x_k)$   $\varphi_{\rm pk\_trk}$   $\varphi_{\rm max\_sens}$   $\varphi_{\rm rms\_yp}$   action
                                          ($\le 0.75$?)             ($\le 1.5$?)                ($\le 0.05$?)
    1   [ 0.500   0.500]     0.902        0.643  yes                2.853  no                   0.0372  yes               cut $\varphi_{\rm max\_sens}$
    2   [ 0.524  -0.256]     0.160        0.858  no                 1.739  no                   0.0543  no                cut $\varphi_{\rm max\_sens}$
    3   [-0.160   0.050]     0.051        0.692  yes                1.489  yes                  0.0526  no                cut $\varphi_{\rm rms\_yp}$
    4   [ 0.344   0.274]     0.384        0.661  yes                2.075  no                   0.0422  yes               cut $\varphi_{\rm max\_sens}$
    5   [ 0.169   0.021]     0.009        0.719  yes                1.514  no                   0.0498  yes               cut $\varphi_{\rm max\_sens}$
    6   [-0.129   0.051]     0.043        0.692  yes                1.488  yes                  0.0522  no                cut $\varphi_{\rm rms\_yp}$
    7   [ 0.236   0.122]     0.041        0.693  yes                1.561  no                   0.0466  yes               cut $\varphi_{\rm max\_sens}$
    8   [ 0.206   0.065]    -0.001        0.708  yes                1.498  yes                  0.0483  yes               done

Table 14.1 The steps performed by the deep-cut ellipsoid algorithm for
the constraint function (14.22). At each iteration a deep-cut was done using
the function that had the largest normalized constraint violation. See also
figure 14.17.

14.5 Example: LQG Weight Selection via Duality

In this section we demonstrate some of the methods described in this chapter on
the problem of weight selection for an LQG controller design. We consider the
multicriterion LQG problem formulated in section 12.2.1, and will use the notation
defined there. The specifications are realizability and limits on the RMS values of
the components of $z$, i.e.,

$${\rm RMS}(z_1) \le \sqrt{a_1}, \quad \ldots, \quad {\rm RMS}(z_L) \le \sqrt{a_L}, \tag{14.23}$$

(with the particular power spectral density matrix for $w$ described in section 12.2.1).
Each $z_i$ is either an actuator signal or some linear combination of the system state.
$\mathcal{A}$ will denote the set of $a \in \mathbf{R}_+^L$ that corresponds to achievable
specifications of the form (14.23).

We will describe an algorithm that solves the feasibility problem for this family
of specifications, i.e., determines whether or not $a \in \mathcal{A}$, and if so, finds a
controller that achieves the given specifications. This problem can usually be solved by a
skilled designer using ad hoc weight adjustment and LQG design (see section 3.7.2),
but we will describe an organized algorithm that cannot fail.

By the convex duality principle (see section 6.6; the technical condition holds in
this case), we know that

$$a \in \mathcal{A} \iff \text{there is no } \lambda \ge 0 \text{ with } \psi(\lambda) > a^T \lambda.$$

Since this condition on $\lambda$ is not affected by positive scaling, we may assume that the
sum of the weights is one. We can thus pose our problem in terms of the following
convex optimization problem:

$$\alpha = \min_{\lambda \ge 0,\ \lambda_1 + \cdots + \lambda_L = 1} \ a^T \lambda - \psi(\lambda). \tag{14.24}$$

Let $\lambda^*$ denote a minimizer (which in fact is unique) for the problem (14.24). Then
we have:

$$a \in \mathcal{A} \iff \alpha \ge 0, \tag{14.25}$$

and moreover if $\alpha \ge 0$, then the LQG design that corresponds to weights $\lambda^*$
achieves the specification (see section 12.2.1). So finding $\alpha$ and $\lambda^*$ will completely
solve our problem.

The equivalence (14.25) can be given a simple interpretation. $a^T \lambda$ is the LQG
cost, using the weights $\lambda$, that a design corresponding to $a$ would achieve, if it were
feasible. $\psi(\lambda)$ is simply the smallest achievable LQG cost, using weights $\lambda$. It
follows that if $a^T \lambda < \psi(\lambda)$, then the specification corresponding to $a$ is not
achievable, since it would beat the optimal LQG design. Thus, we can interpret (14.24) as
the problem of finding the LQG weights such that the LQG cost of a design that just
meets the specification $a$ most favorably compares to the LQG-optimal cost.

We can evaluate a subgradient of $a^T \lambda - \psi(\lambda)$ (in fact, it is differentiable)
using the formula given in section 13.4.8:

$$\begin{bmatrix} a_1 - \phi_1(H_{{\rm lqg},\lambda}) \\ \vdots \\ a_L - \phi_L(H_{{\rm lqg},\lambda}) \end{bmatrix} \in \partial \left( a^T \lambda - \psi(\lambda) \right), \tag{14.26}$$

where $H_{{\rm lqg},\lambda}$ is the LQG-optimal design for the weights $\lambda$. We can therefore
solve (14.24) using any of the algorithms described in this chapter. We will give
some of the details for the ellipsoid algorithm.

We can handle the equality constraint $\lambda_1 + \cdots + \lambda_L = 1$ by letting

$$x = \begin{bmatrix} \lambda_1 \\ \vdots \\ \lambda_{L-1} \end{bmatrix}$$

be our optimization variables, and setting

$$\lambda_L = 1 - \lambda_1 - \cdots - \lambda_{L-1}.$$

The inequality constraint on $x$ is then

$$\max \{ -x_1, \ldots, -x_{L-1},\ x_1 + \cdots + x_{L-1} - 1 \} \le 0. \tag{14.27}$$

A subgradient $g \in \mathbf{R}^{L-1}$ for the objective is readily derived from (14.26):

$$g = \begin{bmatrix} a_1 - \phi_1(H_{{\rm lqg},\lambda}) - a_L + \phi_L(H_{{\rm lqg},\lambda}) \\ \vdots \\ a_{L-1} - \phi_{L-1}(H_{{\rm lqg},\lambda}) - a_L + \phi_L(H_{{\rm lqg},\lambda}) \end{bmatrix}$$

where $H_{{\rm lqg},\lambda}$ is optimal for the current weights.

As initial ellipsoid we take the ball of radius one centered at the weight vector

$$x^{(1)} = \begin{bmatrix} 1/L \\ \vdots \\ 1/L \end{bmatrix}.$$

Since this ball contains the feasible set given by (14.27), we are guaranteed that
every feasible minimizer of (14.24) is inside our initial ellipsoid.

We can now directly apply the algorithm of section 14.4.3. We note a few 
simplifications to the stopping criterion. First, the feasible set is clearly not empty, 
so the algorithm will not terminate during a feasibility cut. Second, the stopping 
criterion can be: 

    until ( $a^T \lambda < \psi(\lambda)$ or $\phi_i(H_{{\rm lqg},\lambda}) \le a_i$ for $i = 1, \ldots, L$ );

The algorithm will terminate either when $a$ is known to be infeasible
($a^T \lambda < \psi(\lambda)$), or when a feasible design has been found.

This ellipsoid algorithm is guaranteed to terminate in a solution to our problem,
except for the case when the specification $a$ is Pareto optimal, i.e., is itself an LQG-
optimal specification for some weight vector. In this exceptional case, $H_{{\rm lqg},\lambda}$ and
$\lambda$ will converge to the unique design and associated weights that meet the specification
$a$. This exceptional case can be ruled out by adding a small tolerance to either of
the inequalities in the stopping criterion above.



14.5.1 Some Numerical Examples 

We now demonstrate this algorithm on the standard example plant from section 2.4,
with the process and sensor noise power spectral densities described in section 11.2.3
and the objectives

$$\phi_1(H) = \mathbf{E}\, u^2, \qquad \phi_2(H) = \mathbf{E}\, y_p^2, \qquad \phi_3(H) = \mathbf{E}\, \dot y_p^2,$$

which are the variance of the actuator signal, the system output signal $y_p$, and its
derivative. The optimization variables are the weights $\lambda_1$ and $\lambda_2$ associated with
the actuator and system output variance, respectively. The weight for the system
output rate is given by $1 - \lambda_1 - \lambda_2$. The initial ellipsoid $E_1$ is a circle centered
at $[1/3\ \ 1/3]^T$ of radius $\sqrt{5}/3$; it includes the triangle of feasible weights $\lambda$ (see
figure 14.19).

    specification   RMS($u$)      RMS($y_p$)     RMS($\dot y_p$)   achievable?
    1               $\le 0.1$     $\le 0.13$     $\le 0.04$        yes
    2               $\le 0.1$     $\le 0.07$     $\le 0.04$        no
    3               $\le 0.15$    $\le 0.07$     $\le 0.04$        yes
    4               $\le 0.149$   $\le 0.0698$   $\le 0.04$        no

Table 14.2 Four different specifications on the RMS values of $u$, $y_p$, and
$\dot y_p$.


For specifications that are either deep inside or far outside the feasible set A, 
the ellipsoid algorithm rapidly finds a set of weights for an LQG design that either 
achieves the specification or proves that the specification is infeasible. To get more 
than a few iterations, specifications close to Pareto optimal must be chosen. We 
will consider the four sets of specifications shown in table 14.2. Specifications 1 
and 3 are achievable (the latter just barely), while 2 and 4 are unachievable (the 
latter only barely). The results of running the ellipsoid algorithm on each of these
specifications are shown in table 14.3.

These results can be verified by checking whether each specification is above or
below the tradeoff surface. Since each specification has the same limit on RMS($\dot y_p$),
we can plot a tradeoff curve between RMS($y_p$) and RMS($u$) when RMS($\dot y_p$) is
required to be below 0.04. Such a tradeoff curve is shown in figure 14.18. As
expected, specification 1 is achievable and specification 2 is not. Specifications 3
and 4 are very close to the tradeoff boundary, but on opposite sides.

The execution of the ellipsoid algorithm is shown in 

• figure 14.19 for specification 1, 

• table 14.4 for specification 2, 

• figure 14.20 for specification 3, 

• figure 14.21 for specification 4. 
quantity               spec. 1     spec. 2     spec. 3     spec. 4

$\sqrt{a}$             0.1         0.1         0.15        0.149
                       0.13        0.07        0.07        0.0698
                       0.04        0.04        0.04        0.04

iterations             23          9           59          34

$\psi$ evaluations     12          6           47          22

$\lambda$              0.00955     0.01951     0.00226     0.00232
                       0.00799     0.09072     0.02745     0.02952
                       0.98246     0.88977     0.97029     0.96816

$\sqrt{J}$             0.09817     0.09726     0.14922     0.14896
                       0.09795     0.05888     0.06997     0.06887
                       0.03986     0.04202     0.03999     0.04005

$\psi(\lambda)$        0.001729    0.002070    0.001737    0.001745

$a^T\lambda$           0.001802    0.002063    0.001738    0.001744

exit condition         spec. 1: $J \le a$
                       spec. 2: $\psi(\lambda) > a^T\lambda$
                       spec. 3: $J \le a$
                       spec. 4: $\psi(\lambda) > a^T\lambda$

achievable?            yes         no          yes         no

Table 14.3 Result of running the ellipsoid algorithm on each of the four 
specifications from table 14.2. 
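
The entries of table 14.3 can be cross-checked with a few lines of arithmetic. The sketch below rests on an assumption that the tabulated numbers themselves suggest: the three-component limit and achieved-value rows are RMS values, and it is their squares that enter $a^T\lambda$ and $\psi(\lambda)$. The function and variable names are ours and purely illustrative.

```python
# Rows of table 14.3, one tuple per specification (1..4).
RMS_LIMITS = [(0.1, 0.13, 0.04), (0.1, 0.07, 0.04),
              (0.15, 0.07, 0.04), (0.149, 0.0698, 0.04)]
WEIGHTS = [(0.00955, 0.00799, 0.98246), (0.01951, 0.09072, 0.88977),
           (0.00226, 0.02745, 0.97029), (0.00232, 0.02952, 0.96816)]
RMS_ACHIEVED = [(0.09817, 0.09795, 0.03986), (0.09726, 0.05888, 0.04202),
                (0.14922, 0.06997, 0.03999), (0.14896, 0.06887, 0.04005)]


def weighted_mean_square(weights, rms_values):
    """Return sum_i weights[i] * rms_values[i]**2."""
    return sum(w * r * r for w, r in zip(weights, rms_values))


def check_table():
    """Recompute psi(lambda), a^T lambda and the 'meets spec' flag per column."""
    rows = []
    for limits, lam, achieved in zip(RMS_LIMITS, WEIGHTS, RMS_ACHIEVED):
        psi = weighted_mean_square(lam, achieved)      # weighted cost of the design
        a_lam = weighted_mean_square(lam, limits)      # weighted specification
        meets = all(j <= a for j, a in zip(achieved, limits))
        rows.append((psi, a_lam, meets))
    return rows


if __name__ == "__main__":
    for spec, (psi, a_lam, meets) in enumerate(check_table(), start=1):
        print(f"spec {spec}: psi={psi:.6f}  a^T lambda={a_lam:.6f}  meets={meets}")
```

Under this reading the recomputed values agree with the tabulated $\psi(\lambda)$ and $a^T\lambda$ to within rounding of the printed digits, and the componentwise comparison reproduces the yes/no pattern of the last row.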
Figure 14.18 The tradeoff curve between RMS($y_p$) and RMS($u$), with the 
specification RMS($\dot y_p$) $\le$ 0.04, is shown, together with the four specifications 
from table 14.2. Since these four specifications all require that RMS($\dot y_p$) $\le$ 
0.04, each specification will be achievable if it lies on or above the tradeoff 
curve. For comparison, the tradeoff curve between RMS($y_p$) and RMS($u$) 
with no specification on RMS($\dot y_p$) is shown with a dashed line. 
Figure 14.19 The progression of the ellipsoid algorithm is shown for 
specification 1 in table 14.2. The initial ellipsoid, $E_1$, is a circle of radius $\sqrt{5}/3$ 
centered at $x^{(1)} = [1/3 \;\; 1/3]^T$ that includes the triangular set of feasible 
weights. The ellipsoid algorithm terminates at a point that corresponds 
to weights for which the LQG-optimal controller satisfies specification 1. 
iter.  $\lambda$                         $\sqrt{J}$                     $\psi(\lambda)$  $a^T\lambda$  action

1      ( 0.33333,  0.33333, 0.33333)     (0.06065, 0.06265, 0.04789)    0.00330          0.00550       cut $\psi$
2      ( 0.09163,  0.27584, 0.63254)     (0.07837, 0.05383, 0.04533)    0.00266          0.00328       cut $\psi$
3      (-0.02217,  0.11561, 0.90656)     -                              -                -             cut $\lambda_1$
4      ( 0.14809, -0.07633, 0.92824)     -                              -                -             cut $\lambda_2$
5      ( 0.09606,  0.20281, 0.70113)     (0.07392, 0.05758, 0.04474)    0.00260          0.00308       cut $\psi$
6      (-0.00277,  0.14218, 0.86059)     -                              -                -             cut $\lambda_1$
7      ( 0.10218,  0.00942, 0.88840)     (0.05459, 0.11386, 0.04218)    0.00201          0.00249       cut $\psi$
8      ( 0.05750,  0.20722, 0.73528)     (0.08348, 0.05416, 0.04421)    0.00245          0.00277       cut $\psi$
9      ( 0.01951,  0.09072, 0.88977)     (0.09726, 0.05888, 0.04202)    0.00207          0.00206       done

Table 14.4 Tracing the execution of the ellipsoid algorithm with 
specification 2 from table 14.2. After 9 iterations $\psi(\lambda) > a^T\lambda$, so specification 2 
has been proven to be unachievable. 
Figure 14.20 The normalized values of RMS($u$), RMS($y_p$), and RMS($\dot y_p$) 
are plotted versus iteration number for specification 3 in table 14.2. These 
values are those achieved by the LQG-optimal regulator with the current 
weights $\lambda$. At iteration 59 all three curves are simultaneously at or below 
1.0, so the algorithm terminates; it has found weights for which the 
LQG-optimal regulator satisfies the specifications. 
Figure 14.21 The difference between $a^T\lambda$ and $\psi(\lambda)$ is plotted for each 
iteration of the ellipsoid algorithm for specification 4 in table 14.2. At 
iteration 34, $\psi(\lambda) > a^T\lambda$ (this point is not plotted), and the algorithm 
terminates; it has proven that the specifications are unachievable. Note 
that at iteration 30, the weights almost proved that the specification is 
unachievable. 
14.6 Complexity of Convex Optimization 

We have seen simple algorithms that can be used to compute the global minimum of 
a convex or quasiconvex function. In fact, these optimization problems are not only 
solvable, they are intrinsically tractable: roughly speaking, they can be solved with 
a "reasonable" amount of computation. For more general optimization problems, 
while we can often compute a local minimum with reasonable computation, the 
computation of the global minimum usually requires vastly more computation, and 
so is tractable only for small problems. 

In this section we informally discuss the complexity of different types of (global) 
optimization problems, which involves the following ideas: 

• a class of problems, 

• a notion of the size of a problem and required accuracy of solution, 

• a measure of computation effort. 

The complexity describes how the minimum computation effort required depends 
on the problem size and required accuracy. 

The four classes of optimization problems that we will consider are: 

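• Convex quadratic program (QP): compute 

$$\min \{\, x^T A x \mid Cx \le B \,\},$$ 

where $x \in \mathbf{R}^n$, $B \in \mathbf{R}^k$, and $A \ge 0$. 

• General QP: compute 

$$\min \{\, x^T A x \mid Cx \le B \,\},$$ 

where $x \in \mathbf{R}^n$ and $B \in \mathbf{R}^k$. 

• Convex program: compute 

$$\min_{x \in \mathcal{K}} f(x),$$ 

where $f : \mathbf{R}^n \to \mathbf{R}$ is convex, and $\mathcal{K} \subseteq \mathbf{R}^n$ is a convex set. 

• General program: compute 

$$\min_{x \in \mathcal{K}} f(x),$$ 

where $f : \mathbf{R}^n \to \mathbf{R}$ and $\mathcal{K} \subseteq \mathbf{R}^n$. 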

In the following sections we describe some measures of the relative complexities of 
these problems. 
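
The only difference between the first two classes is the semidefiniteness condition on the matrix of the quadratic form. As a small illustration, the sketch below tests that condition for the 2 x 2 symmetric case only, using the fact that such a matrix is positive semidefinite exactly when both diagonal entries and the determinant are nonnegative. The function name and the example matrices are hypothetical.

```python
def is_convex_qp_2x2(A, tol=0.0):
    """Return True if the symmetric 2x2 matrix A is positive semidefinite.

    For a symmetric 2x2 matrix this holds exactly when both diagonal
    entries and the determinant are nonnegative.
    """
    (a11, a12), (a21, a22) = A
    if abs(a12 - a21) > 1e-12:
        raise ValueError("A must be symmetric")
    det = a11 * a22 - a12 * a21
    return a11 >= -tol and a22 >= -tol and det >= -tol


if __name__ == "__main__":
    print(is_convex_qp_2x2([[2.0, 1.0], [1.0, 2.0]]))    # objective is convex
    print(is_convex_qp_2x2([[1.0, 0.0], [0.0, -1.0]]))   # indefinite: a general QP
```

For larger matrices one would instead examine the eigenvalues or attempt a Cholesky-type factorization; the point here is only that membership in the convex class is easy to check.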
14.6.1 Bit Complexity of Quadratic Programs 

The problem size is measured by the number of bits required to describe the matrices 
A and C, and the vector B. The computational effort is measured by the number 
of bit operations required to find the (exact) solution. 

Convex QP's have polynomial bit complexity: they can be solved in a number 
of bit operations that grows no faster than a polynomial function of the problem 
size. This celebrated result was obtained by the Soviet mathematician Khachiyan 
in 1979 using a variation of the ellipsoid algorithm. 

In contrast, it is known that general QP's have a bit complexity that is the 
same as many other "hard" problems. The only known algorithms that solve these 
"hard" problems require a number of bit operations that grows exponentially with 
the problem size. Moreover, it is generally believed that there are no algorithms 
that solve these problems using a number of bit operations that grows only as fast 
as a polynomial in the problem size. See the Notes and References. 

In conclusion, in the sense of bit complexity, convex QP's are "tractable", 
whereas no non-combinatorial algorithms for solving general QP's are known, and 
it is widely believed that none exist. 

14.6.2 Information Based Complexity 

Information based complexity is measured by the number of function and gradient 
(or subgradient) evaluations of $f$ that are needed to compute the minimum to a 
guaranteed accuracy; information about $f$ is only known through these function 
and gradient evaluations. The problem size is measured by the dimension $n$ and 
required accuracy $\epsilon$. 

Roughly speaking, bit complexity counts all operations, and assumes that the 
function to be minimized is completely specified, whereas information based com- 
plexity counts only the number of calls to a subroutine that evaluates the function 
and a gradient. 

Many sharp bounds are known for the information based complexity of convex 
and general programs. For example, convex programs can be solved to an accuracy 
of $\epsilon$ with no more than $p(n)\log(1/\epsilon)$ function and subgradient evaluations, where 
$p$ is a specific polynomial. It is also known that the minimum number of function 
and gradient evaluations required to solve a general program grows like $(1/\epsilon)^n$, i.e., 
exponentially. 

Therefore, the information based complexity of general programs is enormously 
larger than that of convex programs. 
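
To get a feel for the gap, the two growth laws can be evaluated side by side. The sketch below is only an illustration: the text does not specify the polynomial $p$, so the choice $p(n) = 2n^2$ is an assumption made purely for the sake of having numbers, and the function names are ours.

```python
import math


def convex_evaluations(n, eps, p=lambda n: 2 * n * n):
    """Growth law p(n) * log(1/eps); the default p is an ASSUMED example."""
    return p(n) * math.log(1.0 / eps)


def general_evaluations(n, eps):
    """Growth law (1/eps)**n for a general (nonconvex) program."""
    return (1.0 / eps) ** n


if __name__ == "__main__":
    for n in (2, 5, 10, 20):
        c = convex_evaluations(n, 1e-3)
        g = general_evaluations(n, 1e-3)
        print(f"n={n:2d}  convex ~ {c:10.0f}   general ~ {g:.1e}")
```

Even for ten variables and a modest accuracy of $10^{-3}$, the first expression is in the thousands while the second is astronomically large, whatever reasonable polynomial is substituted for $p$.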

14.6.3 Some Caveats 

We warn the reader that the results described above treat idealized versions of 
complexity, which may or may not describe the practical tractability. 
For example, the simplex method for solving convex QP's works very well in 
practice, even though it is not a polynomial algorithm. The ellipsoid algorithm for 
solving convex QP's, on the other hand, is polynomial, but so far has not performed 
better in practice than the simplex method. 

As another example, we note that the information based complexity of general 
QP's and convex QP's is the same, and no bigger than a polynomial in n + k 
(because once we know what QP we must solve, no more subroutine calls to the 
function evaluator are required). In contrast, the bit complexity of convex and 
general QP's is generally considered to be enormously different, which is consistent 
with practical experience. The low information based complexity of general QP's 
is a result, roughly speaking, of local computations being "free", with "charges" 
incurred only for function evaluations. 

14.6.4 Local Versus Global Optimization 

General programs are often "solved" in practice using local, often descent, methods. 
These algorithms often converge rapidly to a local minimum of the function $f$. In 
many cases, this local minimum is the global minimum. But verifying that a given 
local minimum is actually global has a high computational cost. 

Local optimization methods often work well for design tasks: they rapidly find 
local minima, which in many cases are actually global, but in return, they give up 
the certainty of determining the global minimum. Often, an acceptable design can 
be found using a local optimization method. 

However, local methods cannot determine a limit of performance: global opti- 
mization is needed to confidently know that a design cannot be achieved, i.e., lies 
in the unshaded region described in section 1.4.1 of chapter 1. So the value of con- 
vexity is that not only can we find designs that we know are "good" (i.e., nearly 
Pareto optimal), we can also, with reasonable computation, determine performance 
limits. 
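
The distinction can be seen on a one-variable toy problem. In the sketch below, the nonconvex function, the step size, and the starting point are all hypothetical choices of ours; a fixed-step gradient method started on the wrong side settles in a local minimum, and only an exhaustive search over the interval reveals that a much lower value exists.

```python
def f(x):
    """A made-up nonconvex function with two local minima."""
    return x**4 - 3.0 * x**2 + x


def df(x):
    """Derivative of f."""
    return 4.0 * x**3 - 6.0 * x + 1.0


def gradient_descent(x, step=0.01, iters=2000):
    """Plain fixed-step gradient descent: a local method."""
    for _ in range(iters):
        x -= step * df(x)
    return x


def grid_search(lo, hi, points):
    """Exhaustive search on a uniform grid: global, but costly in many dimensions."""
    xs = [lo + (hi - lo) * i / (points - 1) for i in range(points)]
    return min(xs, key=f)


if __name__ == "__main__":
    x_local = gradient_descent(1.5)
    x_global = grid_search(-2.0, 2.0, 4001)
    print(f"local method:  x = {x_local:.4f}, f = {f(x_local):.4f}")
    print(f"grid search:   x = {x_global:.4f}, f = {f(x_global):.4f}")
```

The local method gives no hint that its answer is not the best achievable value. In one dimension the grid is cheap, but the number of grid points needed grows exponentially with the number of variables, which is the information based complexity statement of section 14.6.2 in miniature.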
Notes and References 

Descent Methods 

General descent methods are described in, e.g., Luenberger [Lue84]. General methods for 
nondifferentiable optimization are surveyed in Polak [Pol87]. Kiwiel's monograph [Kiw85] 
gives a detailed description of many descent algorithms for nondifferentiable optimization; 
see also Fukushima [Fuk84] and the references cited there. In the Soviet literature, descent 
methods are sometimes called relaxation processes; see for example the survey by Lyubich 
and Maistrovskii [LM70]. 

Specializations of general-purpose algorithms to nondifferentiable convex optimization 
include Wolfe's conjugate subgradients algorithm [Wol75] and Lemaréchal's Davidon 
method for nondifferentiable objectives [Lem75]. 

Nondifferentiable Convex Optimization 

A good general reference on convex optimization is Levitin and Polyak [LP66], which treats 
the infinite-dimensional case, but considers mostly smooth problems. Good general 
references on nondifferentiable convex optimization, which describe some methods presented in 
this chapter, are Akgül [Akg84], Evtushenko [Evt85], and Demyanov and Vasilev [DV85]. 
In [Sho85], Shor describes the subgradient algorithm, a precursor of the ellipsoid method, 
and a variable-metric subgradient algorithm, which is the ellipsoid method as he originally 
developed it. 

Rockafellar's books [Roc81, Roc82] describe subgradients and convex optimization, but 
not the algorithms we have presented in this chapter. 

Cutting-Plane Methods 

The cutting-plane algorithm described in section 14.3.2 is generally attributed to Kel- 
ley [Kel60], although it is related to earlier algorithms, e.g., Cheney and Goldstein's 
method [CG59], which they call Newton's method for convex programming. In fact, Kel- 
ley's description of the algorithm is for the constrained case: he shows how to convert the 
unconstrained problem into one with constraints and linear objective. 

Convergence proofs are given in, e.g., Kelley [Kel60], Levitin and Polyak [LP66, §10], 
Demyanov and Vasilev [DV85, §3.8], and Luenberger [Lue84, p419-420]. The 
cutting-plane algorithm for constrained optimization that we presented in section 14.3.4, and a 
proof of its convergence, can be found in Demyanov and Vasilev [DV85, p420-423]. 

Elzinga and Moore [EM75] give a cutting-plane algorithm for the constrained problem 
that has two possible advantages over the one described in section 14.3.4. Their algorithm 
generates feasible iterates, so we can use the simple stopping criteria used in the algorithm 
for the unconstrained problem, and we do not have to accept "slightly violated constraints". 
Their algorithm also drops old constraints, so that the linear programs solved at each 
iteration do not grow in size as the algorithm proceeds. Demyanov and Vasilev [DV85, 
§3.9-10] describe the extremum basis method, a cutting-plane algorithm that maintains a 
fixed number of constraints in the linear program. See also Gonzaga and Polak [GP79]. 

Ellipsoid Methods 

Two early articles that give algorithms for minimizing a quasiconvex function using the 
idea that a quasigradient evaluated at a point rules out a half-space containing the point 
are Newman [New65] and Levin [Lev65]. In these precursors of the ellipsoid algorithm, 
the complexity of the set which is known to contain a minimizer increases as the algorithm 
proceeds, so that the computation per iteration grows, as in cutting-plane methods. 

The ellipsoid algorithm was developed in the 1970's in the Soviet Union by Shor, Yudin, 
and Nemirovsky. A detailed history of its development, including English and Russian 
references, appears in chapter 3 of Akgül [Akg84]. It was used in 1979 by Khachiyan 
in his famous proof that linear programs can be solved in polynomial time; an English 
translation appears in [Kha79] (see also Gács and Lovász [GL81]). 

The 1981 survey by Bland, Goldfarb, and Todd [BGT81] contains a very clear description 
of the method and has extensive references on its development and early history, 
concentrating on its application to linear programming. Our exposition follows Chapter 3 of the 
book by Grötschel, Lovász, and Schrijver [GLS88]. In [Gof83, Gof84], Goffin gives an 
interpretation of the ellipsoid algorithm as a variable-metric descent algorithm. 

Ecker and Kupferschmid [EK83, EK85] describe the results of extensive numerical tests 
on a large number of benchmark optimization problems, comparing an ellipsoid method to 
other optimization algorithms. Some of these problems are not convex, but the ellipsoid 
method seems to have done very well (i.e., found feasible points with low function values) 
even though it was not designed for nonconvex optimization. In the paper [KME85], 
Kupferschmid et al. describe an application of an ellipsoid algorithm to the nonconvex, 
but important, problem of feedback gain optimization. 

Initializing the Cutting-Plane and Ellipsoid Algorithms 

In many cases we can determine, a priori, a box or ellipsoid that is guaranteed to contain 
a minimizer; we saw an example in the LQG weight selection problem. 

For the cutting-plane algorithm, if we do not know an initial box that is guaranteed to 
contain a minimizer, we can guess an initial box; if the iterates stay on the box boundary for too 
many iterations, then we increase the box size and continue. (By continue, we mean that 
all of the information gathered in previous function and subgradient evaluations can be 
kept.) Roughly speaking, once an iterate lands inside the box, we can be certain that our initial 
box was large enough (provided the Lagrange multipliers in the linear program are positive 
at the iterate inside the box). 

In many cases the ellipsoid algorithm converges to a minimizer even if no minimizers were 
inside the initial ellipsoid, although of course there is no guarantee that this will happen. 
Of course, this can only occur because at each iteration we include in our new ellipsoid 
some points that were not in the previous ellipsoid. These new points normally represent 
"wasted" ellipsoid volume, since we are including points that are known not to include a 
minimizer. But in the case where the minimizer lies outside the initial ellipsoid, these new 
points allow the ellipsoid to twist around so as to include a minimizer. 

If an iterate falls outside the initial ellipsoid, it is good practice to restart the algorithm 
with a larger ellipsoid. We have found a very slow rise in the total number of ellipsoid 
algorithm iterations required as the size of the initial ellipsoid is increased, despite the fact 
that the volume of the initial ellipsoid increases very rapidly with its semi-axes. 
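
One heuristic way to make sense of this observation is a volume count; this is our illustration, not an explanation given by the text. For the basic (central-cut) ellipsoid update in $n \ge 2$ dimensions, each iteration multiplies the ellipsoid volume by a fixed factor below $e^{-1/(2n)}$, so enlarging every semi-axis by a factor $c$ multiplies the initial volume by $c^n$ but adds only a number of iterations proportional to $n \log c$. The function names below are hypothetical.

```python
import math


def volume_ratio(n):
    """Volume of the updated ellipsoid divided by the old one (central cut, n >= 2)."""
    return (n / (n + 1.0)) * (n * n / (n * n - 1.0)) ** ((n - 1) / 2.0)


def extra_iterations(n, radius_factor):
    """Iterations needed to remove the extra volume radius_factor**n (worst-case count)."""
    return math.ceil(n * math.log(radius_factor) / -math.log(volume_ratio(n)))


if __name__ == "__main__":
    n = 20
    print(f"per-iteration volume ratio for n={n}: {volume_ratio(n):.5f}")
    for c in (2.0, 4.0, 8.0):
        print(f"radius x{c:g}: volume x{c**n:.1e}, "
              f"about {extra_iterations(n, c)} extra iterations")
```

The count is a worst-case bound obtained from volumes alone, so the rise observed in practice may be smaller; the sketch shows only that the dependence on the initial size is logarithmic rather than proportional to the volume.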

LQG Weight Selection 

The informal method of adjusting weights is used extensively in LQG design. In the 
LQG problem, the maximum value rule-of-thumb for initial weights is often referred to as 
Bryson's rule [BH75]. 
In [HS78], Harvey and Stein state, 

To use LQG methods, the designer must reduce all of his varied performance 
requirements to a single criterion which is constrained to be quadratic in state 
and controls. So little is known about the relationship between such specific 
criteria and more general control design specifications that the designer must 
invariably resort to trial and error iterations. 

In Stein [Ste79], we find 

Probably the most important area about which better understanding [of LQG 
weight selection] is needed is the relationship between the weighting parame- 
ters selected for the basic scalar performance index and the resulting regulator 
properties. Practitioners have wrestled with this relationship for nearly two 
decades, trying various intuitive ways to select a "good set of weights" to 
satisfy various design specifications. 

We have observed that the mapping from LQG weights into the resulting optimal LQG 
cost, i.e., the dual function $\psi$, is concave. We exploited this property to design an algorithm 
that solves the LQG multicriterion feasibility problem. 

We were originally unaware of any published algorithm that solves this feasibility 
problem, but recently discovered an article by Toivonen and Mäkilä [TM89], which uses a 
quadratically convergent method to minimize $a^T\lambda - \psi(\lambda)$ over $\lambda \ge 0$. 

The deep-cut algorithm of section 14.4.4 can be used with the inequality specification 
$a^T\lambda - \psi(\lambda) > 0$. Such an algorithm will determine whether the RMS specifications are 
achievable or not, but, unlike the algorithm we described, might not find a design meeting 
the specification when the specifications are achievable. 

Complexity of Convex Optimization 

Bit complexity is described in, for example, the book by Garey and Johnson [GJ79]. The 
"hard" problems we referred to in section 14.6.1 are called NP-complete. The results on bit 
complexity of quadratic programs are described in Pardalos and Rosen [PR87]. Pardalos 
and Rosen also describe some methods for solving general QP's; these methods require 
large computation times on supercomputers, whereas very large convex QP's are readily 
solved on much smaller computers. 

The results on the information based complexity of optimization problems are due to 
Nemirovsky and Yudin, and described in the clear and well written book [NY83] and 
article [YN77]. Our description of the problem is not complete: the problems considered 
must also have some known bound on the size of a subgradient over $\mathcal{K}$ (which we used in 
our proofs of convergence). 



Chapter 15 

Solving the Controller Design 
Problem 



In this chapter we describe methods for forming and solving finite-dimensional 
approximations to the controller design problem. A method based on the 
parametrization described in chapter 7 yields an inner approximation of the re- 
gion of achievable specifications in performance space. For some problems, an 
outer approximation of this region can be found by considering a dual problem. 
By forming both approximations, the controller design problem can be solved to 
an arbitrary, and guaranteed, accuracy. 

In chapter 3 we argued that many approaches to controller design could be described 
in terms of a family of design specifications that is parametrized by a performance 
vector $a \in \mathbf{R}^L$, 

$$H \text{ satisfies } \mathcal{D}_{\mathrm{hard}}, \quad \phi_1(H) \le a_1, \; \ldots, \; \phi_L(H) \le a_L. \qquad (15.1)$$ 

Some of these specifications are unachievable; the designer must choose among 
the specifications that are achievable. In terms of the performance vectors, the 
designer must choose an $a \in \mathcal{A}$, where $\mathcal{A}$ denotes the set of performance vectors that 
correspond to achievable specifications of the form (15.1). We noted in chapter 3 
that the actual controller design problem can take several specific forms, e.g., a 
constrained optimization problem with weighted-sum or weighted-max objective, 
or a simple feasibility problem. 

In chapters 7-11 we found that in many controller design problems, the hard 
constraint $\mathcal{D}_{\mathrm{hard}}$ is convex (or even affine) and the functionals $\phi_1, \ldots, \phi_L$ are convex; 
we refer to these as convex controller design problems. We refer to a controller 
design problem in which one or more of the functionals is quasiconvex but not 
convex as a quasiconvex controller design problem. These controller design problems 
can be considered convex (or quasiconvex) optimization problems over $\mathcal{H}$; since $\mathcal{H}$ 
has infinite dimension, the algorithms described in the previous chapter cannot be 
directly applied. 



15.1 Ritz Approximations 



The Ritz method for solving infinite-dimensional optimization problems consists of 
solving the problem over larger and larger finite-dimensional subsets. For the 
controller design problem, the Ritz approximation method is determined by a sequence 
of $n_z \times n_w$ transfer matrices 

$$R_0, R_1, R_2, \ldots \in \mathcal{H}. \qquad (15.2)$$ 

We let 

$$\mathcal{H}_N = \left\{ R_0 + \sum_{1 \le i \le N} x_i R_i \;\middle|\; x_i \in \mathbf{R}, \ 1 \le i \le N \right\}$$ 

denote the finite-dimensional affine subset of $\mathcal{H}$ that is determined by $R_0$ and the 
next $N$ transfer matrices in the sequence. The $N$th Ritz approximation to the 
family of design specifications (15.1) is then 

$$H \text{ satisfies } \mathcal{D}_{\mathrm{hard}}, \quad \phi_1(H) \le a_1, \; \ldots, \; \phi_L(H) \le a_L, \quad H \in \mathcal{H}_N. \qquad (15.3)$$ 

The Ritz approximation yields a convex (or quasiconvex) controller design problem, 
if the original controller problem is convex (or quasiconvex), since it is the original 
problem with the affine specification $H \in \mathcal{H}_N$ adjoined. 

The $N$th Ritz approximation to the controller design problem can be considered 
a finite-dimensional optimization problem, so the algorithms described in chapter 14 
can be applied. With each $x \in \mathbf{R}^N$ we associate the transfer matrix 

$$H_N(x) \triangleq R_0 + \sum_{1 \le i \le N} x_i R_i, \qquad (15.4)$$ 

with each functional $\phi_i$ we associate the function $\phi_i^{(N)} : \mathbf{R}^N \to \mathbf{R}$ given by 

$$\phi_i^{(N)}(x) \triangleq \phi_i(H_N(x)), \qquad (15.5)$$ 

and we define 

$$\mathcal{D}^{(N)} \triangleq \{\, x \mid H_N(x) \text{ satisfies } \mathcal{D}_{\mathrm{hard}} \,\}. \qquad (15.6)$$ 

Since the mapping from $x \in \mathbf{R}^N$ into $\mathcal{H}$ given by (15.4) is affine, the functions 
$\phi_i^{(N)}$ given by (15.5) are convex (or quasiconvex) if the functionals $\phi_i$ are; similarly 
the subsets $\mathcal{D}^{(N)} \subseteq \mathbf{R}^N$ are convex (or affine) if the hard constraint $\mathcal{D}_{\mathrm{hard}}$ is. In 
section 13.5 we showed how to compute subgradients of the functions $\phi_i^{(N)}$, given 
subgradients of the functionals $\phi_i$. 
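
The preservation of convexity under the affine map can be checked numerically on a toy instance. Everything concrete in the sketch below is an assumption made for illustration: the scalar transfer functions playing the roles of $R_0$ and $R_i$, the basis $(1/(s+1))^i$, the frequency grid, and the choice of a peak-gain functional (which is convex in $H$). The check is the midpoint form of the convexity inequality at randomly drawn pairs of coefficient vectors.

```python
import random


def basis(i, s):
    """Hypothetical stable basis element (1/(s+1))**i (an assumption)."""
    return (1.0 / (s + 1.0)) ** i


def closed_loop(x, s):
    """Affine map x -> H_N(x)(s) = R_0 + sum_i x_i R_i for made-up scalar terms."""
    r0 = 1.0 / (s + 2.0)                      # plays the role of R_0
    scale = 1.0 / (s + 3.0)                   # common factor in each R_i
    return r0 + scale * sum(xi * basis(i, s) for i, xi in enumerate(x, start=1))


FREQS = [0.01 * 1.2 ** k for k in range(60)]  # log-spaced frequency grid


def peak_gain(x):
    """A convex functional of H evaluated on the grid: max over w of |H(jw)|."""
    return max(abs(closed_loop(x, 1j * w)) for w in FREQS)


def midpoint_convex(trials=200, n=5, seed=0):
    """Check phi((x+y)/2) <= (phi(x)+phi(y))/2 at random pairs (x, y)."""
    rng = random.Random(seed)
    for _ in range(trials):
        x = [rng.uniform(-5.0, 5.0) for _ in range(n)]
        y = [rng.uniform(-5.0, 5.0) for _ in range(n)]
        mid = [(a + b) / 2.0 for a, b in zip(x, y)]
        if peak_gain(mid) > 0.5 * (peak_gain(x) + peak_gain(y)) + 1e-12:
            return False
    return True


if __name__ == "__main__":
    print("midpoint convexity holds on all samples:", midpoint_convex())
```

A passing check is of course not a proof, but it reflects the argument in the text: the value of the transfer matrix at the midpoint of two coefficient vectors is the average of the two transfer matrices, so any convex functional of $H$ becomes a convex function of $x$.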

Let $\mathcal{A}_N$ denote the set of performance vectors that correspond to achievable 
specifications for the $N$th Ritz approximation (15.3). Then we have 

$$\mathcal{A}_1 \subseteq \cdots \subseteq \mathcal{A}_N \subseteq \cdots \subseteq \mathcal{A},$$ 

i.e., the Ritz approximations yield inner or conservative approximations of the 
region of achievable specifications in performance space. 

If the sequence (15.2) is chosen well, and the family of specifications (15.1) is 
well behaved, then the approximations $\mathcal{A}_N$ should in some sense converge to $\mathcal{A}$ as 
$N \to \infty$. There are many conditions known that guarantee this convergence; see 
the Notes and References at the end of this chapter. 

We note that the specification $\mathcal{H}_{\mathrm{slice}}$ of chapter 11 corresponds to the $N = 2$ 
Ritz approximation: 

$$R_0 = H^{(c)}, \quad R_1 = H^{(a)} - H^{(c)}, \quad R_2 = H^{(b)} - H^{(c)}.$$ 

15.1.1 A Specific Ritz Approximation Method 

A specific method for forming Ritz approximations is based on the parametrization 
of closed-loop transfer matrices achievable by stabilizing controllers (see 
section 7.2.6): 

$$\mathcal{H}_{\mathrm{stable}} = \{\, T_1 + T_2 Q T_3 \mid Q \text{ stable} \,\}. \qquad (15.7)$$ 

We choose a sequence of stable $n_u \times n_y$ transfer matrices $Q_1, Q_2, \ldots$ and form 

$$R_0 = T_1, \quad R_k = T_2 Q_k T_3, \quad k = 1, 2, \ldots \qquad (15.8)$$ 

as our Ritz sequence. Then we have $\mathcal{H}_N \subseteq \mathcal{H}_{\mathrm{stable}}$, i.e., we have automatically 
taken care of the stability specification $H \in \mathcal{H}_{\mathrm{stable}}$. 

To each $x \in \mathbf{R}^N$ there corresponds the controller $K_N(x)$ that achieves the 
closed-loop transfer matrix $H_N(x) \in \mathcal{H}_N$: in the $N$th Ritz approximation, we search over 
a set of controllers that is parametrized by $x \in \mathbf{R}^N$, in the same way that the family 
of PID controllers is parametrized by the vector of gains, which is in $\mathbf{R}^3$. But the 
parametrization $K_N(x)$ has a very special property: it preserves the geometry of the 
underlying controller design problem. If a design specification or functional is 
closed-loop convex or affine, so is the resulting constraint on or function of $x \in \mathbf{R}^N$. This 
is not true of more general parametrizations of controllers, e.g., the PID controllers. 
The controller architecture that corresponds to the parametrization $K_N(x)$ is shown 
in figure 15.1. 
Figure 15.1 The Ritz approximation (15.7-15.8) corresponds to a 
parametrized controller $K_N(x)$ that consists of two parts: a nominal 
controller $K_{\mathrm{nom}}$, and a stable transfer matrix $Q$ that is a linear combination of 
the fixed transfer matrices $Q_1, \ldots, Q_N$. See also section 7.3 and figure 7.5. 



15.2 An Example with an Analytic Solution 

In this section we demonstrate the Ritz method on a problem that has an analytic 
solution. This allows us to see how closely the solutions of the approximations agree 
with the exact, known solution. 



15.2.1 The Problem and Solution 

The example we will study is the standard plant from section 2.4. We consider the 
RMS actuator effort and RMS regulation functionals described in sections 11.2.3 
and 11.3.2: 

$$\mathrm{RMS}(y_p) = \phi_{\mathrm{rms\_yp}}(H) = \left( \|H_{12} W_{\mathrm{sensor}}\|_2^2 + \|H_{13} W_{\mathrm{proc}}\|_2^2 \right)^{1/2},$$ 

$$\mathrm{RMS}(u) = \phi_{\mathrm{rms\_u}}(H) = \left( \|H_{22} W_{\mathrm{sensor}}\|_2^2 + \|H_{23} W_{\mathrm{proc}}\|_2^2 \right)^{1/2}.$$ 

We consider the specific problem: 

$$\min_{\phi_{\mathrm{rms\_yp}}(H) \le 0.1} \phi_{\mathrm{rms\_u}}(H). \qquad (15.9)$$ 

The solution, $\phi^*_{\mathrm{rms\_u}} = 0.0397$, can be found by solving an LQG problem with 
weights determined by the algorithm given in section 14.5. 

15.2.2 Four Ritz Approximations 

We will demonstrate four different Ritz approximations, by considering two different 
parametrizations (i.e., $T_1$, $T_2$, and $T_3$ in (15.7)) and two different sequences of stable 
$Q$'s. 

The parametrizations are given by the formulas in section 7.4 using the two 
estimated-state-feedback controllers $K^{(a)}$ and $K^{(d)}$ from section 2.4 (see the Notes 
and References for more details). The sequences of stable transfer matrices we 
consider are 



Q* = { JTl ), ^ = UnJ' i = 1 >- (15 - 10) 

We will denote the four resulting Ritz approximations as 

$$(K^{(a)}, Q), \quad (K^{(a)}, \tilde Q), \quad (K^{(d)}, Q), \quad (K^{(d)}, \tilde Q). \qquad (15.11)$$ 

The resulting finite-dimensional Ritz approximations of the problem (15.9) turn 
out to have a simple form: both the objective and the constraint function are con- 
vex quadratic (with linear and constant terms) in x. These problems were solved 
exactly using a special algorithm for such problems; see the Notes and References 
at the end of this chapter. The performance of these four approximations is plot- 
ted in figure 15.2 along with a dotted line that shows the exact optimum, 0.0397. 
Figure 15.3 shows the same data on a more detailed scale. 

15.3 An Example with no Analytic Solution 

We now consider a simple modification to the problem (15.9) considered in the 
previous section: we add a constraint on the overshoot of the step response from 
the reference input $r$ to the plant output $y_p$, i.e., 

$$\min_{\substack{\phi_{\mathrm{rms\_yp}}(H) \le 0.1 \\ \phi_{\mathrm{os}}(H_{13}) \le 0.1}} \phi_{\mathrm{rms\_u}}(H). \qquad (15.12)$$ 

Unlike (15.9), no analytic solution to (15.12) is known. For comparison, the optimal 
design for the problem (15.9) has a step response overshoot of 39.7%. 
Figure 15.2 The optimum value of the finite-dimensional inner 
approximation of the optimization problem (15.9) versus the number of terms $N$ 
for the four different Ritz approximations (15.11). The dotted line shows 
the exact solution, RMS($u$) = 0.0397. 

Figure 15.3 Figure 15.2 is re-plotted to show the convergence of the 
finite-dimensional inner approximations to the exact solution, RMS($u$) = 0.0397. 
The same four Ritz approximations (15.11) were formed for the problem (15.12), 
and the ellipsoid algorithm was used to solve them. The performance of the approx- 
imations is plotted in figure 15.4, which the reader should compare to figure 15.2. 
The minimum objective values for the Ritz approximations appear to be converging 
to 0.058, whereas without the step response overshoot specification, the minimum 
objective is 0.0397. We can interpret the difference between these two numbers as 
the cost of reducing the step response overshoot from 39.7% to 10%. 




Figure 15.4 The optimum value of the finite-dimensional inner 
approximation of the optimization problem (15.12) versus the number of terms $N$ 
for the four different Ritz approximations (15.11). No analytic solution to 
this problem is known. The dotted line shows the optimum value of RMS($u$) 
without the step response overshoot specification. 



15.3.1 Ellipsoid Algorithm Performance 

The finite-dimensional optimization problems produced by the Ritz approximations 
of (15.12) are much more substantial than any of the numerical example problems 
we encountered in chapter 14, which were limited to two variables (so we could plot 
the progress of the algorithms). It is therefore worthwhile to briefly describe how 
the ellipsoid algorithms performed on the $N = 20$ $(K^{(a)}, Q)$ Ritz approximation, as 
an example of a demanding numerical optimization problem. 

The basic ellipsoid algorithm was initialized with $A_1 = 5000I$, so the initial 
ellipsoid was a sphere with radius 70.7. All iterates were well inside this initial 
ellipsoid. Moreover, increasing the radius had no effect on the final solution (and, 
indeed, only a small effect on the total computation time). 
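
As a reminder of what one iteration involves, the sketch below runs a basic central-cut ellipsoid iteration from the same kind of starting ellipsoid, a sphere whose matrix is 5000 times the identity. It is not the computation reported here: the two-variable test function, the tolerance, and all names are hypothetical, and the update formulas are the standard ones for $n \ge 2$ variables, with the quantity $\sqrt{g^T A g}$ serving as an upper bound on the suboptimality of the current point.

```python
import math


def f_and_subgrad(x):
    """Hypothetical test function |x1 - 1| + 2|x2 + 2| and one of its subgradients."""
    sign = lambda t: (t > 0) - (t < 0)
    value = abs(x[0] - 1.0) + 2.0 * abs(x[1] + 2.0)
    grad = [float(sign(x[0] - 1.0)), 2.0 * sign(x[1] + 2.0)]
    return value, grad


def ellipsoid_minimize(oracle, x, a_scale=5000.0, tol=1e-6, max_iter=500):
    """Basic ellipsoid method (n >= 2), starting from the sphere A_1 = a_scale * I."""
    n = len(x)
    A = [[a_scale if i == j else 0.0 for j in range(n)] for i in range(n)]
    best_f, best_x = float("inf"), list(x)
    for k in range(1, max_iter + 1):
        f, g = oracle(x)
        if f < best_f:
            best_f, best_x = f, list(x)
        Ag = [sum(A[i][j] * g[j] for j in range(n)) for i in range(n)]
        gAg = sum(g[i] * Ag[i] for i in range(n))
        # sqrt(g'Ag) bounds f(x) - f*; also stop if g = 0 or A has degenerated.
        if gAg <= 0.0 or math.sqrt(gAg) <= tol:
            break
        root = math.sqrt(gAg)
        Agn = [v / root for v in Ag]                    # A g / sqrt(g'Ag)
        x = [x[i] - Agn[i] / (n + 1) for i in range(n)]  # new center
        A = [[n * n / (n * n - 1.0) * (A[i][j] - 2.0 / (n + 1) * Agn[i] * Agn[j])
              for j in range(n)] for i in range(n)]      # new ellipsoid matrix
    return best_x, best_f, k


if __name__ == "__main__":
    print("initial radius:", round(math.sqrt(5000.0), 1))
    x_best, f_best, iters = ellipsoid_minimize(f_and_subgrad, [0.0, 0.0])
    print("best point:", x_best, "value:", f_best, "iterations:", iters)
```

In this toy run the minimizer lies well inside the initial sphere, as the iterates did in the computation described above.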
The algorithm took 34 iterations to find a feasible point, and 4259 iterations for 
the maximum relative error to fall below 0.1%. The maximum and actual relative 
errors versus iteration number are shown in figure 15.5. The relative constraint 
violation and the normalized objective function value versus iteration number are 
shown in figure 15.6. From these figures it can be seen that the ellipsoid algo- 
rithm produces designs that are within a few percent of optimal within about 1500 
iterations. 
Figure 15.5 The ellipsoid algorithm maximum and actual relative errors 
versus iteration number, $k$, for the solution of the $N = 20$ $(K^{(a)}, Q)$ Ritz 
approximation of (15.12). After 34 iterations a feasible point has been found, 
and after 4259 iterations the objective has been computed to a maximum 
relative error of 0.1%. Note the similarity to figure 14.15, which shows a 
similar plot for a much simpler, two variable, problem. 



For comparison, we solved the same problem using the ellipsoid algorithm with 
deep-cuts for both objective and constraint cuts, with the same initial ellipsoid. 
A few more iterations were required to find a feasible point (53), and somewhat 
fewer iterations were required to find the optimum to within 0.1% (2620). Its 
performance is shown in figure 15.7. The objective and constraint function values 
versus iteration number for the deep-cut ellipsoid algorithm were similar to the basic 
ellipsoid algorithm. 

The number of iterations required to find the optimum to within a guaranteed 
maximum relative error of 0.1% for each $N$ (for the $(K^{(a)}, Q)$ Ritz approximation) 
is shown in figure 15.8 for both the regular and deep-cut ellipsoid algorithms. (For 
$N < 4$ the step response overshoot constraint was infeasible in the $(K^{(a)}, Q)$ Ritz 
approximation.) 
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Figure 15.6 The ellipsoid algorithm constraint and objective functionals 
versus iteration number, k, for the solution of the $N = 20$ $(K^{(a)}, Q)$ Ritz
approximation of (15.12). For the constraints, the percentage violation is
plotted. For the objective, $\phi_{\mathrm{rms\_u}}$, the percentage difference between the
current and final objective value, $\phi^{*}_{\mathrm{rms\_u}}$, is plotted. It is hard to distinguish
the plots for the constraints, but the important point here is the "steady,
stochastic" nature of the convergence. Note that within 1500 iterations,
designs were obtained with objective values and constraints within a few
percent of optimal and feasible, respectively.



[Plot: maximum relative error and actual relative error versus iteration number k.]



Figure 15.7 The deep-cut ellipsoid algorithm maximum and actual relative
errors versus iteration number, k, for the solution of the $N = 20$
$(K^{(a)}, Q)$ Ritz approximation of (15.12). After 53 iterations a feasible point
has been found, and after 2620 iterations the objective has been computed
to a maximum relative error of 0.1%. Note the similarity to figure 15.5.
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Figure 15.8 The number of ellipsoid iterations required to compute upper 
bounds of RMS(u) to within 0.1%, versus the number of terms $N$, is shown
for the Ritz approximation $(K^{(a)}, Q)$. The upper curve shows the iterations
for the regular ellipsoid algorithm. The lower curve shows the iterations 
for the deep-cut ellipsoid algorithm, using deep-cuts for both objective and 
constraint cuts. 
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15.4 An Outer Approximation via Duality 

Figure 15.4 suggests that the minimum value of the problem (15.12) is close to 
0.058, since several different Ritz approximations appear to be converging to this 
value, at least over the small range of N plotted. The value 0.058 could reasonably 
be accepted on this basis alone. 

To further strengthen the plausibility of this conclusion, we could appeal to some 
convergence theorem (one is given in the Notes and References). But even knowing 
that each of the four curves shown in figure 15.4 converges to the exact value of 
the problem (15.12), we can only assert with certainty that the optimum value lies 
between 0.0397 (the true optimum of (15.9)) and 0.0585 (the lowest objective value 
computed with a Ritz approximation). 

This is a problem of stopping criterion, which we discussed in chapter 14 in the 
context of optimization algorithms; accepting 0.058 as the optimum value of (15.12) 
corresponds to a (quite reasonable) heuristic stopping criterion. Just as in chap- 
ter 14, however, a stopping criterion that is based on a known lower bound may be 
worth the extra computation involved. 

In this section we describe a method for computing lower bounds on the value 
of (15.12), by forming an appropriate dual problem. Unlike the stopping criteria for 
the algorithms in chapter 14, which involve little or no additional computation, the 
lower bound computations that we will describe require the solution of an auxiliary 
minimum $H_2$ norm problem.

The dual function introduced in section 3.6.2 produces lower bounds on the so- 
lution of (15.12), but is not useful here since we cannot exactly evaluate the dual 
function, except by using the same approximations that we use to approximately 
solve (15.12). To form a dual problem that we can solve requires some manipulation 
of the problem (15.12) and a generalization of the dual function described in sec- 
tion 3.6.2. The generalization is easily described informally, but a careful treatment 
is beyond the scope of this book; see the Notes and References at the end of this 
chapter. 

We replace the RMS actuator effort objective and the RMS regulation constraint
in (15.12) with the corresponding variance objective and constraint, i.e., we square
the objective and constraint functionals $\phi_{\mathrm{rms\_u}}$ and $\phi_{\mathrm{rms\_yp}}$. Instead of considering
the step response overshoot constraint in (15.12) as a single functional inequality,
we will view it as a family of constraints on the step response: one constraint for
each $t \ge 0$, that requires the step response at time $t$ not exceed 1.1:

$$\begin{array}{ll}
\min & \phi_{\mathrm{rms\_u}}(H)^2 \\
\text{subject to} & \phi_{\mathrm{rms\_yp}}(H)^2 \le 0.1^2, \\
& \phi^{\mathrm{step},t}(H) \le 1.1, \quad t \ge 0,
\end{array} \tag{15.13}$$

where $\phi^{\mathrm{step},t}$ is the affine functional that evaluates the step response of the 1,3 entry
at time $t$:

$$\phi^{\mathrm{step},t}(H) = s_{13}(t).$$
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In this transformed optimization problem, the objective is quadratic and the con- 
straints consist of one quadratic constraint along with a family of affine constraints 
that is parametrized by $t \in \mathbf{R}_+$.

This suggests the following generalized dual functional for the problem (15.13): 

$$\psi(\lambda_u, \lambda_y, \lambda_s) = \min_{H} \left( \lambda_u \phi_{\mathrm{rms\_u}}(H)^2 + \lambda_y \phi_{\mathrm{rms\_yp}}(H)^2 + \int_0^\infty \lambda_s(t)\, \phi^{\mathrm{step},t}(H)\, dt \right), \tag{15.14}$$

where the "weights" now consist of the positive numbers $\lambda_u$ and $\lambda_y$, along with
the function $\lambda_s : \mathbf{R}_+ \to \mathbf{R}_+$. So in equation (3.10), we have replaced the weighted
sum of (a finite number of) constraint functionals with a weighted integral over the
family of constraint functionals that appears in (15.13).

It is easily established that $-\psi$ is a convex functional of $(\lambda_u, \lambda_y, \lambda_s)$ and that
whenever $\lambda_y \ge 0$ and $\lambda_s(t) \ge 0$ for all $t \ge 0$, we have

$$\psi(1, \lambda_y, \lambda_s) - 0.1^2 \lambda_y - \int_0^\infty 1.1\, \lambda_s(t)\, dt \le \alpha_{\mathrm{pri}}$$

where $\alpha_{\mathrm{pri}}$ is the optimum value of (15.12). Thus, computing $\psi(1, \lambda_y, \lambda_s)$ yields a
lower bound on the optimum of (15.12).
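
The reason the bound holds is the usual weak duality argument: for any feasible point and any nonnegative weights, the weighted sum of the constraint functionals is no larger than the weighted sum of their limits. A finite-dimensional analog makes this concrete. The sketch below is an illustration under our own assumptions, not the controller design computation: the variable is a vector in the plane, the quadratic functionals are squared distances to hypothetical points `C0` and `C1`, and the family of affine constraints is replaced by three hypothetical rows `A_ROWS` with the common limit 1.1. Because both quadratics are isotropic, the minimizer of the weighted sum is available in closed form.

```python
# Toy data (all hypothetical): minimize |x - C0|^2
# subject to |x - C1|^2 <= R2 and a . x <= LIMIT for each row a of A_ROWS.
C0 = (3.0, 2.0)
C1 = (0.0, 0.0)
R2 = 4.0
A_ROWS = [(1.0, 0.0), (1.0, 0.25), (1.0, 0.5)]
LIMIT = 1.1

def sq_dist(x, c):
    return (x[0] - c[0]) ** 2 + (x[1] - c[1]) ** 2

def dot(a, x):
    return a[0] * x[0] + a[1] * x[1]

def lower_bound(lam_y, lam_s):
    """Dual function value minus the weighted limits, for nonnegative weights."""
    # Linear term contributed by the weighted affine constraints.
    lin = [sum(l * a[i] for l, a in zip(lam_s, A_ROWS)) for i in range(2)]
    # Minimizer of |x - C0|^2 + lam_y |x - C1|^2 + lin . x (set the gradient to zero).
    x = [(2.0 * C0[i] + 2.0 * lam_y * C1[i] - lin[i]) / (2.0 * (1.0 + lam_y))
         for i in range(2)]
    psi = (sq_dist(x, C0) + lam_y * sq_dist(x, C1)
           + sum(l * dot(a, x) for l, a in zip(lam_s, A_ROWS)))
    return psi - R2 * lam_y - LIMIT * sum(lam_s)

def is_feasible(x):
    return sq_dist(x, C1) <= R2 and all(dot(a, x) <= LIMIT for a in A_ROWS)

# Any feasible point gives an upper bound on the optimum value.
x_feas = (0.6, 0.8)
upper = sq_dist(x_feas, C0)

# A coarse grid over the weights; every grid point gives a valid lower bound.
grid = [0.0, 0.5, 1.0, 2.0, 4.0, 8.0]
bounds = [lower_bound(ly, (l0, l1, l2))
          for ly in grid for l0 in grid for l1 in grid for l2 in grid]
best_lower = max(bounds)
print("optimum lies between", best_lower, "and", upper)
```

Maximizing over the weights tightens the lower bound, which is the role played by the dual problem; here a crude grid search stands in for that maximization.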

The convex duality principle (equations (6.11-6.12) of section 6.6) suggests that
we actually have

$$\alpha_{\mathrm{pri}} = \max_{\lambda_y \ge 0,\ \lambda_s(t) \ge 0} \left( \psi(1, \lambda_y, \lambda_s) - 0.1^2 \lambda_y - \int_0^\infty 1.1\, \lambda_s(t)\, dt \right), \tag{15.15}$$

which in fact is true. So the optimization problem on the right-hand side of (15.15)
can be considered a dual of (15.12).

We can compute $\psi(1, \lambda_y, \lambda_s)$, provided we have a state-space realization with
impulse response $\lambda_s(t)$. The objective in (15.14) is an LQG objective, with the
addition of the integral term, which is an affine functional of $H$. By completing
the square, it can be recast as an $H_2$-optimal controller problem, and solved by (an
extension of) the method described in section 12.2. Since we can find a minimizer
$H_\lambda$ for the problem (15.14), we can evaluate a subgradient for $-\psi$:

$$\psi^{\mathrm{sg}}(\lambda_u, \lambda_y, \lambda_s) = -\lambda_u \phi_{\mathrm{rms\_u}}(H_\lambda)^2 - \lambda_y \phi_{\mathrm{rms\_yp}}(H_\lambda)^2 - \int_0^\infty \lambda_s(t)\, \phi^{\mathrm{step},t}(H_\lambda)\, dt$$

(cf. section 13.4.8).

By applying a Ritz approximation to the infinite-dimensional optimization prob- 
lem (15.15), we obtain lower bounds on the right-hand, and hence left-hand sides 
of (15.15). 
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To demonstrate this, we use two Ritz sequences for X s given by the transfer 
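functions

$$\Lambda_0 = 0, \qquad \Lambda_i(s) = \frac{[\ldots]}{s + 2}\,[\ldots], \qquad \tilde{\Lambda}_i(s) = \frac{[\ldots]}{s + 4}\,[\ldots], \qquad i = 1, \ldots \tag{15.16}$$
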
We shall denote these two Ritz approximations $\Lambda$ and $\tilde{\Lambda}$. The solution of the
finite-dimensional inner approximation of the dual problem (15.15) is shown in figure 15.9
for various values of $N$. Each curve gives lower bounds on the solution of (15.12).

Figure 15.9 The optimum value of the finite-dimensional inner approximation
of the dual problem (15.15) versus the number of terms $N$ for the two
different Ritz approximations (15.16). The dotted line shows the optimum
value of RMS(u) without the step response overshoot specification.

The upper bounds from figure 15.4 and lower bounds from figure 15.9 are shown 
together in figure 15.10. The exact solution of (15.12) is known to lie between the 
dashed lines; this band is shown in figure 15.11 on a larger scale for clarity. 

The best lower bound on the value of (15.12) is 0.0576, corresponding to the 
N = 20 {A} Ritz approximation of (15.15). The best upper bound on the value 
of (15.12) is 0.0585, corresponding to the $N = 20$ $(K^{(a)}, Q)$ Ritz approximation
of (15.12). Thus we can state with certainty that 

$$0.0576 \le \min_{\substack{\phi_{\mathrm{rms\_yp}}(H) \le 0.1 \\ \phi_{\mathrm{os}}(H_{13}) \le 0.1}} \phi_{\mathrm{rms\_u}}(H) \le 0.0585.$$

We now know that 0.058 is within 0.0005 (i.e., 1%) of the minimum value of the 
problem (15.12); the plots in figure 15.4 only strongly hint that this is so. 
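
The arithmetic behind this statement is short enough to check directly. The snippet below uses only the two bounds quoted above; the variable names are ours.

```python
# Bounds on the optimum value of (15.12), as quoted in the text.
lower, upper = 0.0576, 0.0585
estimate = 0.058

gap = upper - lower
# Worst-case distance from the estimate to the (unknown) exact optimum.
worst_abs_error = max(estimate - lower, upper - estimate)
# Dividing by the lower bound gives a conservative relative error.
worst_rel_error = worst_abs_error / lower

print("gap between bounds:", gap)
print("worst-case absolute error:", worst_abs_error)
print("worst-case relative error (%):", 100.0 * worst_rel_error)
```

The worst case occurs if the exact optimum sits at the upper bound, which is where the figure 0.0005 comes from.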
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TV 

Figure 15.10 The curves from figures 15.4 and 15.9 are shown. The solu- 
tion of each finite-dimensional inner approximation of (15.12) gives an upper 
bound of the exact solution. The solution of each finite-dimensional inner 
approximation of the dual problem (15.15) gives a lower bound of the exact 
solution. The exact solution is therefore known to lie between the dashed 
lines. 







Figure 15.11 Figure 15.10 is shown in greater detail. The exact solution 
of (15.12) is known to lie inside the shaded band. 




We should warn the reader that we do not know how to form a solvable dual 
problem for the most general convex controller design problem; see the Notes and 
References. 

15.5 Some Tradeoff Curves 

In the previous three sections we studied two specific optimization problems, and 
the performance of several approximate solution methods. In this section we con- 
sider some related two-parameter families of design specifications and the same 
approximate solution methods, concentrating on the effect of the approximations 
on the computed region of achievable specifications in performance space. 

15.5.1 Tradeoff for Example with Analytic Solution 

We consider the family of design specifications given by

$$\phi_{\mathrm{rms\_yp}}(H) \le \alpha, \qquad \phi_{\mathrm{rms\_u}}(H) \le \beta, \tag{15.17}$$

for the same example that we have been studying. 

Figure 15.12 shows the tradeoff curves for the $(K^{(a)}, Q)$ Ritz approximations
of (15.17), for three values of $N$. The regions above these curves are thus $\mathcal{A}_N$; $\mathcal{A}$
is the region above the solid curve, which is the exact tradeoff curve. This figure
makes clear the nomenclature "inner approximation".

15.5.2 Tradeoff for Example with no Analytic Solution 

We now consider the family of design specifications given by

$$\mathcal{D}^{(\alpha,\beta)}: \quad \phi_{\mathrm{rms\_yp}}(H) \le \alpha, \quad \phi_{\mathrm{rms\_u}}(H) \le \beta, \quad \phi_{\mathrm{os}}(H_{13}) \le 0.1. \tag{15.18}$$

Inner and outer approximations for the tradeoff curve for (15.18) can be com- 
puted using Ritz approximations to the primal and dual problems. For example, an 
$N = 6$ $(K^{(d)}, Q)$ Ritz approximation to (15.18) shows that the specifications in the
top right region in figure 15.13 are achievable. On the other hand, an $N = 5$ $\Lambda$ Ritz
approximation to the dual of (15.18) shows that the specifications in the bottom 
left region in figure 15.13 are unachievable. Therefore the exact tradeoff curve is 
known to lie in the shaded region in figure 15.14. 

The inner and outer approximations shown in figure 15.14 can be improved by 
solving larger finite-dimensional approximations to (15.18) and its dual, as shown 
in figure 15.15. 

Figure 15.15 shows the tradeoff curve for (15.17), for comparison. The gap 
between this curve and the shaded region shows the cost of the additional step 
response specification. (The reader should compare these curves with those of 
figure 14.18 in section 14.5, which shows the cost of the additional specification 
RMS($\dot{y}_p$) $\le 0.04$ on the family of design specifications (15.17).)

Figure 15.12 The exact tradeoff between RMS($y_p$) and RMS($u$) for the
problem (15.17) can be computed using LQG theory. The tradeoff curves
computed using three $(K^{(a)}, Q)$ Ritz inner approximations are also shown.

Figure 15.13 With the specification $\phi_{\mathrm{os}}(H_{13}) \le 0.1$, specifications on
$\phi_{\mathrm{rms\_yp}}$ and $\phi_{\mathrm{rms\_u}}$ in the upper shaded region are shown to be achievable
by solving an $N = 6$ $(K^{(d)}, Q)$ finite-dimensional approximation of (15.18).
Specifications in the lower shaded region are shown to be unachievable by
solving an $N = 5$ $\Lambda$ finite-dimensional approximation of the dual of (15.18).



[Plot: horizontal axis RMS(u); the boundary obtained from the $N = 5$ dual Ritz approximation is labeled.]
Figure 15.14 From figure 15.13 we can conclude that the exact tradeoff 
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Figure 15.15 With larger finite-dimensional approximations to (15.18) and 
its dual, the bounds on the achievable limit of performance are substantially 
improved. The dashed curve is the boundary of the region of achievable 
performance without the step response overshoot constraint. 
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Notes and References 

Ritz Approximations 

The term comes from Rayleigh-Ritz approximations to infinite-dimensional eigenvalue
problems; see, e.g., Courant and Hilbert [CH53, p175]. The topic is treated in, e.g., 
section 3.7 of Daniel [Dan71]. A very clear discussion of Ritz methods, and their conver- 
gence properties, appears in sections 8 and 9 of the paper by Levitin and Polyak [LP66]. 

Proof that the Example Ritz Approximations Converge 

It is often possible to prove that a Ritz approximation "works", i.e., that $\mathcal{A}_N \to \mathcal{A}$ (in
some sense) as $N \to \infty$. As an example, we give a complete proof that for the four Ritz
approximations (15.11) of the controller design problem with objectives RMS actuator
effort, RMS regulation, and step response overshoot, we have

$$\bigcup_N \mathcal{A}_N \supseteq \operatorname{int} \mathcal{A}. \tag{15.19}$$

This means that every achievable specification that is not on the boundary (i.e., not Pareto
optimal) can be achieved by a Ritz approximation with large enough $N$.

We first express the problem in terms of the parameter $Q$ in the free parameter
representation of $\mathcal{H}_{\mathrm{stable}}$:

$$\begin{aligned}
\psi_1(Q) &= \phi_{\mathrm{rms\_yp}}(T_1 + T_2 Q T_3) = \|G_1 + \tilde{G}_1 Q\|_2, \\
\psi_2(Q) &= \phi_{\mathrm{rms\_u}}(T_1 + T_2 Q T_3) = \|G_2 + \tilde{G}_2 Q\|_2, \\
\psi_3(Q) &= \phi_{\mathrm{os}}([1\ 0](T_1 + T_2 Q T_3)[0\ 0\ 1]^T) = \|G_3 + \tilde{G}_3 Q\|_{\mathrm{pk\_step}} - 1,
\end{aligned}$$

where the $G_i$ depend on submatrices of $T_1$, and the $\tilde{G}_i$ depend on the appropriate
submatrices of $T_2$ and $T_3$. (These transfer matrices incorporate the constant power spectral
densities of the sensor and process noises, and combine the $T_2$ and $T_3$ parts since $Q$ is
scalar.) These transfer matrices are stable and rational. We also have $\tilde{G}_3(0) = 0$, since $P_0$
has a pole at $s = 0$.

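We have

$$\mathcal{A} = \{ a \mid \psi_i(Q) \le a_i,\ i = 1, 2, 3, \text{ for some } Q \in H_2 \},$$

$$\mathcal{A}_N = \Big\{ a \ \Big|\ \psi_i(Q) \le a_i,\ i = 1, 2, 3, \text{ for some } Q = \sum_{i=1}^{N} x_i Q_i \Big\}$$

($H_2$ is the Hilbert space of Laplace transforms of square integrable functions from $\mathbf{R}_+$ into
$\mathbf{R}$).

We now observe that $\psi_1$, $\psi_2$, and $\psi_3$ are continuous functionals on $H_2$; in fact, there is an
$M < \infty$ such that

$$|\psi_i(Q) - \psi_i(\tilde{Q})| \le M \|Q - \tilde{Q}\|_2, \quad i = 1, 2, 3. \tag{15.20}$$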
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This follows from the inequalities 

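$$\begin{aligned}
|\psi_1(Q) - \psi_1(\tilde{Q})| &\le \|\tilde{G}_1\|_\infty \|Q - \tilde{Q}\|_2, \\
|\psi_2(Q) - \psi_2(\tilde{Q})| &\le \|\tilde{G}_2\|_\infty \|Q - \tilde{Q}\|_2, \\
|\psi_3(Q) - \psi_3(\tilde{Q})| &\le \|\tilde{G}_3/s\|_2 \|Q - \tilde{Q}\|_2.
\end{aligned}$$

(The first two are obvious; the last uses the general fact that $\|AB\|_{\mathrm{pk\_step}} \le \|A/s\|_2 \|B\|_2$,
which follows from the Cauchy-Schwarz inequality.)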

We now observe that the sequence $(s + \alpha)^{-i}$, $i = 1, 2, \ldots$, where $\alpha > 0$, is complete, i.e.,
has dense span in $H_2$. In fact, if this sequence is orthonormalized we have the Laplace
transforms of the Laguerre functions on $\mathbf{R}_+$.

We can now prove (15.19). Suppose $a \in \mathcal{A}$ and $\epsilon > 0$. Since $a \in \mathcal{A}$, there is a $Q^* \in H_2$
such that $\psi_i(Q^*) \le a_i$, $i = 1, 2, 3$. Using completeness of the $Q_i$'s, find $N$ and $x^* \in \mathbf{R}^N$
such that

$$\|Q^* - Q^*_N\|_2 \le \epsilon/M,$$

where

$$Q^*_N = \sum_{i=1}^{N} x^*_i Q_i.$$

By (15.20), $\psi_i(Q^*_N) \le a_i + \epsilon$, $i = 1, 2, 3$. This proves (15.19).

The Example Ritz Approximations 

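We used the state-space parametrization in section 7.4, with

$$P_0^{\mathrm{std}}(s) = C(sI - A)^{-1}B,$$

where

$$A = \begin{bmatrix} -10 & 0 & 0 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix}, \qquad B = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}, \qquad C = \begin{bmatrix} 0 & -1 & 10 \end{bmatrix}.$$

The controllers $K^{(a)}$ and $K^{(d)}$ are estimated-state-feedback controllers with

$$K_{\mathrm{sfb}}^{(a)} = \begin{bmatrix} -6.00000 & 5.25000 & 2.50000 \end{bmatrix}, \qquad L_{\mathrm{est}}^{(a)} = \begin{bmatrix} 5.20000 \\ -2.08000 \\ -0.80800 \end{bmatrix},$$

$$K_{\mathrm{sfb}}^{(d)} = \begin{bmatrix} 1.42758 & 10.29483 & 2.44949 \end{bmatrix}, \qquad L_{\mathrm{est}}^{(d)} = \begin{bmatrix} 0.00000 \\ -3.16228 \\ -1.11150 \end{bmatrix}.$$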
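
The realization above can be sanity-checked numerically. The sketch below rests on two assumptions that should be read as such: first, that the matrices are the companion-form reading $A = [-10\ 0\ 0;\ 1\ 0\ 0;\ 0\ 1\ 0]$, $B = [1\ 0\ 0]^T$, $C = [0\ {-1}\ 10]$, with the zero entries restored by us; second, that the state feedback enters as $A - BK$. Under that reading, $C(sI - A)^{-1}B$ works out to $(10 - s)/(s^2(s + 10))$, and the first state-feedback gain places the closed-loop poles at $-2$ and $-1 \pm 0.5j$. The helper names are ours.

```python
# Companion-form reading of the realization (zero entries are our assumption).
A = [[-10.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
B = [1.0, 0.0, 0.0]
C = [0.0, -1.0, 10.0]
K_SFB_A = [-6.0, 5.25, 2.5]   # first state-feedback gain listed in the text

def det3(m):
    # Determinant of a 3x3 matrix by cofactor expansion (works for complex entries).
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def s_minus(M, s):
    # Form the matrix sI - M.
    return [[(s if i == j else 0.0) - M[i][j] for j in range(3)] for i in range(3)]

def transfer(s):
    # Evaluate C (sI - A)^{-1} B by solving (sI - A) x = B with Cramer's rule.
    M = s_minus(A, s)
    d = det3(M)
    x = []
    for col in range(3):
        Mi = [[B[i] if j == col else M[i][j] for j in range(3)] for i in range(3)]
        x.append(det3(Mi) / d)
    return sum(C[i] * x[i] for i in range(3))

def closed_loop_charpoly(s, K):
    # Characteristic polynomial det(sI - (A - B K)), assuming the sign convention u = -K x.
    Acl = [[A[i][j] - B[i] * K[j] for j in range(3)] for i in range(3)]
    return det3(s_minus(Acl, s))

s0 = complex(1.0, 2.0)
print(transfer(s0), (10.0 - s0) / (s0 ** 2 * (s0 + 10.0)))
print(abs(closed_loop_charpoly(-2.0, K_SFB_A)),
      abs(closed_loop_charpoly(complex(-1.0, 0.5), K_SFB_A)))
```

That the gain yields such round pole locations is some evidence that the restored zero entries are the intended ones, but it is not a substitute for checking against the printed matrices.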
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Problem with Analytic Solution 

With the Ritz approximations, the problem has a single convex quadratic constraint, and 
a convex quadratic objective, which are found by solving appropriate Lyapunov equations. 
The resulting optimization problems are solved using a standard method that is described 
in, e.g., Golub and Van Loan [GL89, p564-566]. Of course an ellipsoid or cutting-plane
algorithm could also be used. 

Dual Outer Approximation for Linear Controller Design 

General duality in infinite-dimensional optimization problems is treated in the book by 
Rockafellar [Roc74], which also has a complete reference list of other sources covering this 
material in detail. See also the book by Anderson and Nash [AN87] and the paper by 
Reiland [Rei80]. 

As far as we know, the idea of forming finite-dimensional outer approximations to a convex 
linear controller design problem, by Ritz approximation of an appropriate dual problem, 
is new. 

The method can be applied when the functionals are quadratic (e.g., weighted $H_2$ norms
of submatrices of $H$), or involve envelope constraints on time domain responses. We do not
know how to form a solvable dual problem of a general convex controller design problem.
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Chapter 16 

Discussion and Conclusions 



We summarize the main points that we have tried to make, and we discuss some 
applications and extensions of the methods described in this book, as well as some 
history of the main ideas. 



16.1 The Main Points 

• An explicit framework. A sensible formulation of the controller design problem 
is possible only by considering simultaneously all of the closed-loop transfer 
functions of interest, i.e., the closed-loop transfer matrix H, which should 
include every closed-loop transfer function necessary to evaluate a candidate 
design. 

• Convexity of many specifications. The set of transfer matrices that meet a 
design specification often has simple geometry — affine or convex. In many 
other cases it is possible to form convex inner (conservative) approximations. 

• Effectiveness of convex optimization. Many controller design problems can 
be cast as convex optimization problems, and therefore can be "efficiently" 
solved. 

• Numerical methods for performance limits. The methods described in this 
book can be used both to design controllers (via the primal problem) and to 
find the limits of performance (via the dual problem). 

16.2 Control Engineering Revisited 

In this section we return to the broader topic of control engineering. Some of the 
major tasks of control engineering are shown in figure 16.1: 
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• The system to be controlled, along with its sensors and actuators, is modeled 
as the plant P. 

• Vague goals for the behavior of the closed-loop system are formulated as a set 
of design specifications (chapters 8-10). 

• If the plant is LTI and the specifications are closed-loop convex, the resulting 
feasibility problem can be solved (chapters 13-15). 

• If the specifications are achievable, the designer will check that the design is 
satisfactory, perhaps by extensive simulation with a detailed (probably non- 
linear) model of the system. 




Figure 16.1 A partial flowchart of the control engineer's tasks. 

One design will involve many iterations of the steps shown in figure 16.1. We 
now discuss some possible design iterations. 

Modifying the Specifications 

The specifications are weakened if they are infeasible, and possibly tightened if they 
are feasible, as shown in figure 16.2. This iteration may take the form of a search 
over Pareto optimal designs (chapter 3). 




Figure 16.2 Based on the outcome of the feasibility problem, the designer 
may decide to modify (e.g., tighten or weaken) some of the specifications. 




Modifying the Control Configuration 

Based on the outcome of the feasibility problem, the designer may modify the 
choice and placement of the sensors and actuators, as shown in figure 16.3. If the 
specifications are feasible, the designer might remove actuators and sensors to see 
if the specifications are still feasible; if the specifications are infeasible, the designer 
may add or relocate actuators and sensors until the specifications become achievable. 
The value of knowing that a given set of design specifications cannot be achieved 
with a given configuration should be clear. 



Figure 16.3 Based on the outcome of the feasibility problem, the designer 
may decide to add or remove sensors or actuators. 



These iterations can take a form that is analogous to the iteration described 
above, in which the specifications are modified. We consider a fixed set of specifi- 
cations, and a family (which is usually finite) of candidate control configurations. 
Figure 16.4 shows fourteen possible control configurations, each of which consists of
some selection among the two potential actuators $A_1$ and $A_2$ and the three sensors
$S_1$, $S_2$, and $S_3$. (These are the configurations that use at least one sensor and exactly
one of the two actuators. $A_1$ and $A_2$ might represent two candidate motors for a
system that can only accommodate one.) These control configurations are partially
ordered by inclusion; for example, $A_1S_1$ consists of deleting the sensor $S_3$ from the
configuration $A_1S_1S_3$.

These different control configurations correspond to different plants, and there- 
fore different feasibility problems, some of which may be feasible, and others in- 
feasible. One possible outcome is shown in figure 16.5: nine of the configurations 
result in the specifications being feasible, and five of the configurations result in 
the specifications being infeasible. In the iteration described above, the designer 
could choose among the achievable specifications; here, the designer can choose 
among the control configurations that result in the design specification being feasible.
Continuing the analogy, we might say that $A_1S_1S_2$ is a Pareto optimal control
configuration, on the boundary between feasibility and infeasibility.
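
The bookkeeping in this iteration is easy to mechanize. The following sketch enumerates the fourteen configurations, orders them by inclusion, and picks out the minimal feasible ones. The feasibility test here is a hypothetical stand-in of our own (in practice each call would be a convex feasibility problem for the corresponding plant); its table `HYPOTHETICAL_MINIMAL` was chosen only so that nine configurations come out feasible and $A_1S_1S_2$ is minimal, as in the outcome described above, and it does not reproduce the actual content of figure 16.5.

```python
from itertools import combinations

ACTUATORS = ["A1", "A2"]
SENSORS = ["S1", "S2", "S3"]

def all_configurations():
    # Exactly one actuator, and any nonempty subset of the sensors.
    configs = []
    for act in ACTUATORS:
        for r in range(1, len(SENSORS) + 1):
            for subset in combinations(SENSORS, r):
                configs.append((act, frozenset(subset)))
    return configs

def is_subconfiguration(small, big):
    # Partial order by inclusion: same actuator, sensors of `small` contained in `big`.
    return small[0] == big[0] and small[1] <= big[1]

# Hypothetical feasibility data: for each actuator, the smallest sensor sets that work.
HYPOTHETICAL_MINIMAL = {
    "A1": [frozenset({"S1", "S2"}), frozenset({"S3"})],
    "A2": [frozenset({"S3"})],
}

def feasible(config):
    # Stand-in for solving the feasibility problem; adding sensors never hurts.
    act, sensors = config
    return any(m <= sensors for m in HYPOTHETICAL_MINIMAL[act])

configs = all_configurations()
achievable = [c for c in configs if feasible(c)]
# A configuration is "Pareto optimal" here if no proper subconfiguration is achievable.
pareto = [c for c in achievable
          if not any(o != c and is_subconfiguration(o, c) for o in achievable)]

for act, sensors in pareto:
    print(act, sorted(sensors))
```

Since feasibility is monotone with respect to inclusion, a designer need not test every configuration: once a configuration is found infeasible, all of its subconfigurations can be discarded.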
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Figure 16.4 The possible actuator and sensor configurations, with the 
partial ordering induced by achievability of a set of specifications. 



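[Diagram: the configurations of figure 16.4, divided into those for which $\mathcal{D}$ is achievable and those for which $\mathcal{D}$ is unachievable.]

Figure 16.5 The actuator and sensor configurations that can meet the
specification $\mathcal{D}$.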



Modifying the Plant Model and Specifications 

After choosing an achievable set of design specifications, the design is verified: does
the controller, designed on the basis of the LTI model $P$ and the design specifications
$\mathcal{D}$, achieve the original goals when connected in the real closed-loop system?
If the answer is no, the plant $P$ and design specifications $\mathcal{D}$ have failed to accurately
represent the original system and goals, and must be modified, as shown in
figure 16.6.

Perhaps some unstated goals were not included in the design specifications. For
example, if some critical signal is too big in the closed-loop system, it should be
added to the regulated variables signal, and suitable specifications added to $\mathcal{D}$, to
constrain its size.

Figure 16.6 If the outcome of the feasibility problem is inconsistent with the
designer's criteria then the plant and specifications must be modified to
capture the designer's intent.



As a specific example, the controller designed in the rise time versus undershoot 
tradeoff example in section 12.4, would probably be unsatisfactory, since our design 
specifications did not constrain actuator effort. This unsatisfactory aspect of the 
design would not be apparent from the specifications — indeed, our design cannot 
be greatly improved in terms of rise time or undershoot. The excessive actuator 
effort would become apparent during design verification, however. The solution, of 
course, is to add an appropriate specification that limits actuator effort. 

An achievable design might also be unsatisfactory because the LTI plant P is 
not a sufficiently good model of the system to be controlled. Constraining various 
signals to be smaller may improve the accuracy with which the system can be 
modeled by an LTI P; adding appropriate robustness specifications (chapter 10) 
may also help. 



16.3 Some History of the Main Ideas 

16.3.1 Truxal's Closed-Loop Design Method 

The idea of first designing the closed-loop system and then determining the con- 
troller required to achieve this closed-loop system is at least forty years old. An 
explicit presentation of such a method appears in Truxal's 1950 Ph.D. thesis
[Tru50], and chapter 5 of Truxal's 1955 book, Automatic Feedback Control
System Synthesis, in which we find [Tru55, p279]:

Guillemin in 1947 proposed that the synthesis of feedback control sys- 
tems take the form . . . 

1. The closed-loop transfer function is determined from the specifications.

2. The corresponding open-loop transfer function is found. 

3. The appropriate compensation networks are synthesized. 
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Truxal cites his Ph.D. thesis and a 1951 article by Aaron [Aar51]. 

On the difference between classical controller synthesis and the method he proposes,
he states ([Tru55, p278-279]):

The word synthesis rigorously implies a logical procedure for the transi- 
tion from specifications to system. In pure synthesis, the designer is able 
to take the specifications and in a straightforward path proceed to the 
final system. In this sense, neither the conventional methods of servo 
design nor the root locus method is pure synthesis, for in each case the 
designer attempts to modify and to build up the open-loop system until 
he has reached a point where the system, after the loop is closed, will 
be satisfactory. 

. . . [The closed-loop design] approach to the synthesis of closed-loop sys- 
tems represents a complete change in basic thinking. No longer is the 
designer working inside the loop and trying to splice things up so that 
the overall system will do the job required. On the contrary, he is now 
saying, "I have a certain job that has to be done. I will force the system 
to do it." 

So Truxal views his closed-loop controller design method as a "more logical synthesis 
pattern" (p278) than classical methods. He does not extensively justify this view, 
except to point out the simple relation between the classical error constants and the 
closed-loop transfer function (p281). (In chapter 8 we saw that the classical error 
constants are affine functionals of the closed-loop transfer matrix.) 

The closed-loop design method is described in the books [NGK57], [RF58,
ch7], [Hor63, §5.12], and [FPW90, §5.7] (see also the Notes and References from
chapter 7 on the interpolation conditions).

16.3.2 Fegley's Linear and Quadratic Programming Approach 

The observation that some controller design problems can be solved by numerical 
optimization that involves closed-loop transfer functions is made in a series of papers
starting in 1964 by Fegley and colleagues. In [Feg64] and [FH65], Fegley applies 
linear programming to the closed-loop controller design approach, incorporating 
such specifications as asymptotic tracking of a specific command signal and an 
overshoot limit. This method is extended to use quadratic programming in [PF66, 
BF68]. In [CF68] and [MF71], specifications on RMS values of signals are included. 
A summary of most of the results of Fegley and his colleagues appears in [FBB71], 
which includes examples such as a minimum variance design with a step response 
envelope constraint. This paper has the summary: 

Linear and quadratic programming are applicable ... to the design of 
control systems. The use of linear and quadratic programming fre- 
quently represents the easiest approach to an optimal solution and often 
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makes it possible to impose constraints that could not be imposed in 
other methods of solution. 

So, several important ideas in this book appear in this series of papers by Fegley 
and colleagues: designing the closed-loop system directly, noting the restrictions 
placed by the plant on the achievable closed-loop system; expressing performance 
specifications as closed-loop convex constraints; and using numerical optimization
to solve problems that do not have an analytical solution (see the quote above). 

Several other important ideas, however, do not appear in this series of papers. 
Convexity is never mentioned as the property of the problems that makes effective 
solution possible; linear and quadratic programming are treated as useful "tools" 
which they "apply" to the controller design problem. The casual reader might 
conclude that an extension of the method to indefinite (nonconvex) quadratic pro- 
gramming is straightforward, and might allow the designer to incorporate some 
other useful specifications. This is not the case: numerically solving nonconvex 
QP's is vastly more difficult than solving convex QP's (see section 14.6.1). 

Another important idea that does not appear in the early literature on the 
closed-loop design method is that it can potentially search over all possible LTI 
controllers, whereas a classical design method (or indeed, a modern state-space 
method) searches over a restricted (but often adequate) set of LTI controllers. Fi- 
nally, this early form of the closed-loop design method is restricted to the design 
of one closed-loop transfer function, for example, from command input to system 
output. 

16.3.3 Q-Parameter Design 

The closed-loop design method was first extended to MAMS control systems (i.e., by 
considering closed-loop transfer matrices instead of a particular transfer function),
in a series of papers by Desoer and Chen [DC81A, DC81B, CD82B, CD83] and 
Gustafson and Desoer [GD83, DG84B, DG84A, GD85]. These papers emphasize 
the design of controllers, and not the determination that a set of design specifications 
cannot be achieved by any controller. 

In his 1986 Ph.D. thesis, Salcudean [Sal86] uses the parametrization of achievable
closed-loop transfer matrices described in chapter 7 to formulate the controller
design problem as a constrained convex optimization problem. He describes many of 
the closed-loop convex specifications we encountered in chapters 8-10, and discusses 
the importance of convexity. See also the article by Polak and Salcudean [PS89]. 

The authors of this book and colleagues have developed a program called QDES, 
which is described in the article [BBB88]. The program accepts input written 
in a control specification language that allows the user to describe a discrete-time 
controller design problem in terms of many of the closed-loop convex specifications
presented in this book. A simple method is used to approximate the controller 
design problem as a finite-dimensional linear or quadratic programming problem, 
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which is then solved. The simple organization and approximations made in QDES 
make it practical only for small problems. The paper by Oakley and Barratt [OB90]
describes the use of QDES to design a controller for a flexible mechanical structure. 

16.3.4 FIR Filter Design via Convex Optimization 

A relevant parallel development took place in the area of digital signal processing. 
In about 1969, several researchers observed that many finite impulse response (FIR) 
filter design problems could be cast as linear programs; see, for example, the articles 
[CRR69, Rab72] or the books by Oppenheim and Schafer [OS70, §5.6] and 
Rabiner and Gold [RG75, ch. 3]. In [RG75, §3.39] we even find designs subject to 
both time and frequency domain specifications: 

Quite often one would like to impose simultaneous restrictions on both 
the time and frequency response of the filter. For example, in the design 
of lowpass filters, one would often like to limit the step response over- 
shoot or ripple, at the same time maintaining some reasonable control 
over the frequency response of the filter. Since the step response is a 
linear function of the impulse response coefficients, a linear program is 
capable of setting up constraints of the type discussed above. 

A recent article on this topic is [OKU88]. 

As in the early work on the closed-loop design method, convexity is not recognized 
as the property of the FIR filter design problem that allows efficient solution. Nor is 
it noted that the method actually computes the global optimum, i.e., if the method 
fails to design an FIR filter that meets some set of convex specifications, then the 
specifications cannot be achieved by any FIR filter (of that order). 
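
The linear programming formulation quoted above can be sketched in a few lines. The following is a minimal illustration, not the method of [RG75] or of QDES: it assumes a symmetric (linear-phase) lowpass filter so that the amplitude response is a real linear function of the coefficients, enforces the frequency domain constraints only on a finite grid, and uses `scipy.optimize.linprog` as the solver. The function name `design_lowpass_lp`, the band edges, the ripple, and the step response limit are all hypothetical choices made for the example.

```python
import numpy as np
from scipy.optimize import linprog


def design_lowpass_lp(M=15, wp=0.2 * np.pi, ws=0.3 * np.pi,
                      ripple=0.05, step_max=1.10, ngrid=200):
    """Linear-phase lowpass FIR design as a linear program (illustrative).

    Variables: a_0..a_M (half of the symmetric impulse response) and t,
    a bound on the stopband amplitude, which is minimized.
    """
    N = 2 * M + 1
    # Map the M+1 free coefficients a to the symmetric impulse response h.
    to_h = np.zeros((N, M + 1))
    to_h[M, 0] = 1.0
    for k in range(1, M + 1):
        to_h[M - k, k] = 1.0
        to_h[M + k, k] = 1.0
    # Step response s[n] = h[0] + ... + h[n] is linear in a.
    to_step = np.tril(np.ones((N, N))) @ to_h

    def amplitude_rows(w):
        # A(w) = a_0 + 2 * sum_k a_k cos(k w), also linear in a.
        rows = 2.0 * np.cos(np.outer(w, np.arange(M + 1)))
        rows[:, 0] = 1.0
        return rows

    Ap = amplitude_rows(np.linspace(0.0, wp, ngrid))   # passband grid
    As = amplitude_rows(np.linspace(ws, np.pi, ngrid))  # stopband grid
    zp = np.zeros((ngrid, 1))
    ones_s = np.ones((ngrid, 1))
    zs = np.zeros((N, 1))

    A_ub = np.vstack([
        np.hstack([Ap, zp]),        #  A(w) <= 1 + ripple    (passband)
        np.hstack([-Ap, zp]),       # -A(w) <= -(1 - ripple) (passband)
        np.hstack([As, -ones_s]),   #  A(w) - t <= 0         (stopband)
        np.hstack([-As, -ones_s]),  # -A(w) - t <= 0         (stopband)
        np.hstack([to_step, zs]),   #  s[n] <= step_max      (time domain)
    ])
    b_ub = np.concatenate([
        np.full(ngrid, 1.0 + ripple),
        np.full(ngrid, -(1.0 - ripple)),
        np.zeros(2 * ngrid),
        np.full(N, step_max),
    ])
    cost = np.zeros(M + 2)
    cost[-1] = 1.0  # minimize the stopband bound t
    bounds = [(None, None)] * (M + 1) + [(0.0, None)]
    res = linprog(cost, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
    h = to_h @ res.x[:-1]
    return h, res.x[-1], res


if __name__ == "__main__":
    h, t, res = design_lowpass_lp()
    print("solver status:", res.message)
    print("stopband bound t:", t)
    print("peak of step response:", np.max(np.cumsum(h)))
```

Because the problem is a linear program, a solver report of infeasibility would mean that no symmetric filter of this length meets the gridded specifications, which is the limit-of-performance reading emphasized in the text.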

16.4 Some Extensions 

16.4.1 Discrete-Time Plants 

Essentially all of the material in this book applies to single-rate discrete-time plants 
and controllers, provided the obvious changes are made (e.g., redefining stability to 
mean no poles on or outside the unit disk). For a discrete-time development, there 
is a natural choice of stable transfer matrices that can be used to form the (analog 
of the) Ritz sequence (15.8) described in section 15.1: 

$$Q_{ijk}(z) = E_{ij}\, z^{-(k-1)}, \qquad 1 \le i \le n_u, \quad 1 \le j \le n_y, \quad k = 1, 2, \ldots,$$

($E_{ij}$ is the matrix with a unit $i,j$ entry, and all other entries zero), which corresponds 
to a delay of $k-1$ time steps, from the $j$th input of $Q$ to its $i$th output. Thus in the 
Ritz approximation, the entries of the transfer matrix $Q$ are polynomials in $z^{-1}$, 
i.e., FIR filters. This approach is taken in the program QDES [BBB88]. 
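
To see why an FIR choice of $Q$ leads to a finite-dimensional convex problem, the sketch below works through the scalar case. It assumes the free parameter representation $H = T_1 + T_2 Q T_3$ listed under "Other Symbols", and uses made-up stable impulse responses for $T_1$, $T_2$, $T_3$ (the names `t1`, `t2`, `t3`, `t23` and the helper `closed_loop_impulse` are hypothetical). It shows that the closed-loop impulse response is an affine function of the FIR taps of $Q$.

```python
import numpy as np

L = 50                      # number of impulse response samples kept
n = np.arange(L)
t1 = 0.5 ** n               # hypothetical stable T1 (impulse response)
t2 = 0.8 ** n               # hypothetical stable T2
t3 = (-0.3) ** n            # hypothetical stable T3
t23 = np.convolve(t2, t3)[:L]   # impulse response of T2*T3, first L samples


def closed_loop_impulse(x, t1, t23):
    """Impulse response of T1 + T2*Q*T3 for a scalar FIR Q with taps x.

    Tap x[k] multiplies z^{-k}, i.e. a delay of k samples, so column k of G
    is the impulse response of T2*T3 delayed by k samples.
    """
    L = len(t1)
    G = np.zeros((L, len(x)))
    for k in range(len(x)):
        G[k:, k] = t23[:L - k]
    return t1 + G @ x, G


if __name__ == "__main__":
    xa = np.array([1.0, -0.5, 0.25])
    xb = np.array([0.0, 2.0, 1.0])
    ha, _ = closed_loop_impulse(xa, t1, t23)
    hb, _ = closed_loop_impulse(xb, t1, t23)
    hm, _ = closed_loop_impulse(0.3 * xa + 0.7 * xb, t1, t23)
    # Affine dependence: mixing the taps mixes the closed-loop responses.
    print(np.allclose(hm, 0.3 * ha + 0.7 * hb))
```

Since the response is `t1 + G @ x`, any constraint that is convex in the closed-loop response (for example, a bound on step response overshoot) becomes a convex constraint on the tap vector `x`.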

Many of the results can be extended to multi-rate plants and controllers, i.e., 
a plant in which different sensor signals are sampled at different rates, or differ- 
ent actuator signals are updated at different rates. A parametrization of stabi- 
lizing multi-rate controllers has recently been developed by Meyer [Mey90]; this 
parametrization uses a transfer matrix Q(z) that ranges over all stable transfer 
matrices that satisfy some additional convex constraints. 

16.4.2 Nonlinear Plants 

There are several heuristic methods for designing a nonlinear controller for a non- 
linear plant, based on the design of an LTI controller for an LTI plant (or a family 
of LTI controllers for a family of LTI plants); see the Notes and References for chap- 
ter 2. In the Notes and References for chapter 10, we saw a method of designing a 
nonlinear controller for a plant that has saturating actuators. These methods often 
work well in practice, but do not qualify as extensions of the methods and ideas 
described in this book, since they do not consider all possible closed-loop systems 
that can be achieved. In a few cases, however, stronger results have been obtained. 

In [DL82], Desoer and Liu have shown that for stable nonlinear plants, there is 
a parametrization of stabilizing controllers that is similar to the one described in 
section 7.2.4, provided a technical condition on P holds (incremental stability). 

For unstable nonlinear plants, however, only partial results have been obtained. 
In [DL83] and [AD84], it is shown how a family of stabilizing controllers can be 
obtained by first finding one stabilizing controller, and then applying the results 
of Desoer and Liu mentioned above. But even in the case of an LTI plant and 
controller, this "two-step compensation" approach can fail to yield all controllers 
that stabilize the plant. This approach is discussed further in the articles [DL84A, 
DL84B, DL85]. 

In a series of papers, Hammer has investigated an extension of the stable factor- 
ization theory (see the Notes and References for chapter 7) to nonlinear systems; 
see [Ham88] and the references therein. Stable factorizations of nonlinear systems 
are also discussed in Verma [Ver88]. 
Notation and Symbols 



Basic Notation 

Notation    Meaning 

$\{\;\}$    Delimiters for sets, and for statement grouping in algorithms in chapter 14. 
$(\;)$    Delimiters for expressions. 
$f : X \to Y$    A function from the set $X$ into the set $Y$. 
$\emptyset$    The empty set. 
$\wedge$    Conjunction of predicates; "and". 
$\|\cdot\|$    A norm; see page 69. A particular norm is indicated with a mnemonic subscript. 
$\partial\phi(x)$    The subdifferential of the functional $\phi$ at the point $x$; see page 293. 
$\triangleq$    Equals by definition. 
$\simeq$    Equals to first order. 
$\approx$    Approximately equal to (used in vague discussions). 
$\lesssim$    The inequality holds to first order. 
$\delta X$    A first order change in $X$. 
$\arg\min$    A minimizer of the argument. See page 58. 
$\mathbf{C}$    The complex numbers. 
$\mathbf{C}^n$    The vector space of $n$-component complex vectors. 
$\mathbf{C}^{m \times n}$    The vector space of $m \times n$ complex matrices. 
$\mathbf{E}\,X$    The expected value of the random variable $X$. 
$\Im(z)$    The imaginary part of a complex number $z$. 
$\inf$    The infimum of a function or set. The reader unfamiliar with the notation inf can substitute min without ill effect. 
$j$    A square root of $-1$. 
$\limsup$    The asymptotic supremum of a function; see page 72. 
$\mathrm{Prob}(Z)$    The probability of the event $Z$. 
$\Re(z)$    The real part of a complex number $z$. 
$\mathbf{R}$    The real numbers. 
$\mathbf{R}_+$    The nonnegative real numbers. 
$\mathbf{R}^n$    The vector space of $n$-component real vectors. 
$\mathbf{R}^{m \times n}$    The vector space of $m \times n$ real matrices. 
$\sigma_i(M)$    The $i$th singular value of a matrix $M$: the square root of the $i$th largest eigenvalue of $M^*M$. 
$\sigma_{\max}(M)$    The maximum singular value of a matrix $M$: the square root of the largest eigenvalue of $M^*M$. 
$\sup$    The supremum of a function or set. The reader unfamiliar with the notation sup can substitute max without ill effect. 
$\mathrm{Tr}\,M$    The trace of a matrix $M$: the sum of its entries on the diagonal. 
$M \ge 0$    The $n \times n$ complex matrix $M$ is positive semidefinite, i.e., $z^*Mz \ge 0$ for all $z \in \mathbf{C}^n$. 
$M > 0$    The $n \times n$ complex matrix $M$ is positive definite, i.e., $z^*Mz > 0$ for all nonzero $z \in \mathbf{C}^n$. 
$\lambda \ge 0$    The $n$-component real-valued vector $\lambda$ has nonnegative entries, i.e., $\lambda \in \mathbf{R}^n_+$. 
$M^T$    The transpose of a matrix or transfer matrix $M$. 
$M^*$    The complex conjugate transpose of a matrix or transfer matrix $M$. 
$M^{1/2}$    A symmetric square root of a matrix $M = M^* \ge 0$, i.e., $M^{1/2} M^{1/2} = M$. 
Global Symbols 

Symbol    Meaning    Page 

$\phi$    A function from the space of transfer matrices to real numbers, i.e., a functional on $\mathcal{H}$. A particular function is indicated with a mnemonic subscript.    53 
$\varphi$    The restriction of a functional $\phi$ to a finite-dimensional domain, i.e., a function from $\mathbf{R}^n$ to $\mathbf{R}$. A particular function is indicated with a mnemonic subscript.    252 
$\mathcal{D}$    A design specification: a predicate or boolean function on $\mathcal{H}$. A particular design specification is indicated with a mnemonic subscript.    47 
$H$    The closed-loop transfer matrix from $w$ to $z$.    33 
$H_{ab}$    The closed-loop transfer matrix from the signal $b$ to the signal $a$.    32 
$\mathcal{H}$    The set of all $n_z \times n_w$ transfer matrices. A particular subset of $\mathcal{H}$ (i.e., a design specification) is indicated with a mnemonic subscript.    48 
$K$    The transfer matrix of the controller.    32 
$L$    The classical loop gain, $L = P_0 K$.    36 
$n_w$    The number of exogenous inputs, i.e., the size of $w$.    26 
$n_u$    The number of actuator inputs, i.e., the size of $u$.    26 
$n_z$    The number of regulated variables, i.e., the size of $z$.    26 
$n_y$    The number of sensed outputs, i.e., the size of $y$.    26 
$P$    The transfer matrix of the plant.    31 
$P_0$    The transfer matrix of a classical plant, which is usually one part of the plant model $P$.    34 
$S$    The classical sensitivity transfer function or matrix.    36, 41 
$T$    The classical I/O transfer function or matrix.    36, 41 
$w$    Exogenous input signal vector.    25 
$u$    Actuator input signal vector.    25 
$z$    Regulated output signal vector.    26 
$y$    Sensed output signal vector.    26 
Other Symbols 

Symbol    Meaning    Page 

$\Delta$    A feedback perturbation.    221 
$\boldsymbol{\Delta}$    A set of feedback perturbations.    221 
$\|x\|$    The Euclidean norm of a vector $x \in \mathbf{R}^n$ or $x \in \mathbf{C}^n$, i.e., $\sqrt{x^*x}$.    70 
$\|u\|_1$    The $L_1$ norm of the signal $u$.    81 
$\|u\|_2$    The $L_2$ norm of the signal $u$.    80 
$\|u\|_\infty$    The peak magnitude norm of the signal $u$.    70 
$\|u\|_{\mathrm{aa}}$    The average-absolute value norm of the signal $u$.    74 
$\|u\|_{\mathrm{rms}}$    The RMS norm of the signal $u$.    72 
$\|u\|_{\mathrm{ss}\infty}$    The steady-state peak magnitude norm of the signal $u$.    72 
$\|H\|_2$    The $H_2$ norm of the transfer function $H$.    96, 110 
$\|H\|_\infty$    The $H_\infty$ norm of the transfer function $H$.    112, 112 
$\|H\|_{\infty,a}$    The $a$-shifted $H_\infty$ norm of the transfer function $H$.    100 
$\|H\|_{\mathrm{hankel}}$    The Hankel norm of the transfer function $H$.    103 
$\|H\|_{\mathrm{pk\_step}}$    The peak of the step response of the transfer function $H$.    95 
$\|H\|_{\mathrm{pk\_gn}}$    The peak gain of the transfer function $H$.    97, 111 
$\|H\|_{\mathrm{rms},w}$    The RMS response of the transfer function $H$ when driven by the stochastic signal $w$.    96 
$\|H\|_{\mathrm{rms\_gn}}$    The RMS gain of the transfer function $H$, equal to its $H_\infty$ norm.    99, 112 
$\|H\|_{\mathrm{wc}}$    A worst case norm of the transfer function $H$.    97 
$\mathcal{A}$    The region of achievable specifications in performance space.    139 
$A_P, B_w, B_u, C_z, C_y, D_{zw}, D_{zu}, D_{yw}, D_{yu}$    The matrices in a state-space representation of the plant.    43 
$\mathrm{CF}(u)$    The crest factor of a signal $u$.    91 
$\mathrm{Dz}(\cdot)$    The dead-zone function.    230 
$I_\gamma(H)$    The $\gamma$-entropy of the transfer function $H$.    113 
$n_{\mathrm{proc}}$    A process noise, often actuator-referred.    35 
$n_{\mathrm{sensor}}$    A sensor noise.    35 
$F_u(a)$    The amplitude distribution function of the signal $u$.    76 
$h(t)$    The impulse response of the transfer function $H$ at time $t$.    96 
$H^{(a)}, H^{(b)}, H^{(c)}, H^{(d)}$    The closed-loop transfer matrices from $w$ to $z$ achieved by the four controllers $K^{(a)}, K^{(b)}, K^{(c)}, K^{(d)}$ in our standard example system.    42 
$K^{(a)}, K^{(b)}, K^{(c)}, K^{(d)}$    The four controllers in our standard example system.    42 
$\mathcal{P}$    A perturbed plant set.    211 
$P_0^{\mathrm{std}}$    The transfer function of our standard example classical plant.    41 
$p, q$    Auxiliary inputs and outputs used in the perturbation feedback form.    221 
$s$    Used for both complex frequency, $s = \sigma + j\omega$, and the step response of a transfer function or matrix (although not usually in the same equation). 
$\mathrm{Sat}(\cdot)$    The saturation function.    220 
$\mathrm{sgn}(\cdot)$    The sign function.    97 
$T_1, T_2, T_3$    Stable transfer matrices used in the free parameter representation of achievable closed-loop transfer matrices.    156 
$\star$    A submatrix or entry of $H$ not relevant to the current discussion.    172 
$\phi^*$    The minimum value of the function $\phi$.    311 
$x^*$    A minimizing argument of the function $\phi$, i.e., $\phi^* = \phi(x^*)$.    311 
$\mathrm{Tv}(f)$    The total variation of the function $f$.    98 
List of Acronyms 



Acronym    Meaning    Page 

1-DOF    One Degree-Of-Freedom    37 
2-DOF    Two Degree-Of-Freedom    39 
ARE    Algebraic Riccati Equation    122 
FIR    Finite Impulse Response    380 
I/O    Input/Output    38 
LQG    Linear Quadratic Gaussian    278 
LQR    Linear Quadratic Regulator    275 
LTI    Linear Time-Invariant    29 
MAMS    Multiple-Actuator, Multiple-Sensor    40 
MIMO    Multiple-Input, Multiple-Output    110 
PID    Proportional plus Integral plus Derivative    5 
QP    Quadratic Program    345 
RMS    Root-Mean-Square    72 
SASS    Single-Actuator, Single-Sensor    34 
SISO    Single-Input, Single-Output    95 
Process noise 

actuator-referred, 35 
Programmable logic controllers, 2, 19 

Q 

QDES, 379 

Q-parametrization, 162, 169, 353, 379, 

381 
Quadratic program, 345, 378, 380 
Quantization error, 70 
Quasiconcave, 189 

Quasiconvex functional, 132, 176, 189, 
296 

relative overshoot, undershoot, 175 
Quasigradient, 296 

settling time, 302 

R 

Realizability, 147 
Real-valued signal, 2 
Reference signal, 28, 34 
Regulated output, 26 
Regulation, 6, 12, 187, 254 

bandwidth, 189 

RMS limit, 188 
Regulator 

classical, 34, 172 
Relative 

degree, 156 

overshoot, 173 

tolerance, 317 
Resource consumption, 81 
Response time, 177 

functional, 177 

generalized, 189 
Riccati equation, 92, 122, 277, 280, 283 
Rise time, 175 

unstable zero, 283 
Ritz 

approximation, 291, 352, 369 
RMS 

gain norm, 98, 112 

noise response, 95, 110 

signal norm, 72 

specification, 281, 332 

vector signal norm, 86 
Robustness, 6 

bandwidth limit, 234 

definition for specification, 211 

specification, 7, 210, 262 
Robust performance, 239, 246, 266 
Robust stability, 212, 264 

classical Nyquist interpretation, 244 

norm-bound specification for, 231 

versus differential sensitivity, 210, 233, 
244 

via Lyapunov function, 245 
Root locus, 11, 18, 65, 378 
Root-mean-square value, 72 



Saturation, 88, 108, 219, 229, 239, 246 

actuator, 190, 229, 239 
Saturator, 220 
Scaling, 88, 235 
Schur form, 123, 277 
Seminorm, 92 
Sensible specification, 154, 169, 277, 285, 

287 
Sensitivity 

comparison, 203 

complementary transfer function, 36 

logarithmic, 197, 260 

matrix, 41 

output-referred matrix, 202 

step response, 205, 261 

transfer function, 36, 196 

versus robustness, 210, 233, 244 
Sensor, 1 

boolean, 2 

dynamics, 213 

integrated, 9, 18 

many high quality, 10 

multiple, 40 

noise, 35, 153 

output, 26 

placement, 3 

selection, 3, 375 



Set 

affine, 128 

boolean algebra of subsets, 127 

convex, 128 

representing design specification, 127 

supporting hyperplane, 298 
Settling time, 175, 250 

quasigradient, 302 
Shifted $H_\infty$ norm, 136, 138 
Side information, 26, 35, 152, 285, 376 
Signal 

amplitude distribution of, 76 

average-absolute value norm, 74 

average-absolute value vector norm, 
87 

bandwidth, 118, 184 

comparing norms of, 89 

continuous, 3 

continuous-time, 29 

crest factor, 91 

discrete-time, 3, 29 

ITAE norm, 84 

norm of, 69 

peak norm, 70 

peak vector norm, 86 

real-valued, 2 

RMS norm, 72 

size of, 69 

stochastic norm of, 75 

total area, 81 

total energy, 80 

transient, 72, 80, 92 

unknown-but-bounded, 70, 184 

vector norm of, 86 

vector RMS norm, 86 

vector total area, 88 

vector total energy, 88 
Simulation, 6, 10 
Singing, 170 
Singular 

perturbation, 217, 244 

structured value, 245 

value, 110, 112, 124, 304 

value plot, 110 
Size 

of a linear system, 93 

of a signal, 69 
Slew rate, 82, 181, 185 

limit, 97 
Small gain 

method, 210, 231, 244 
method state-space connection, 245 

theorem, 117, 231 
Snap, crackle and pop norms, 83 
Software 

controller design, 19 
Specification 

actuator effort, 12, 190, 256, 281 

circle theorem, 245 

closed-loop convex, 135, 245 

controller order, 150 

differential sensitivity, 195 

disturbance rejection, 187 

functional equality, 131 

functional inequality, 52, 54, 131 

input-output, 172, 250 

jerk, 190 

linear ordering, 132, 300 

M-circle constraint, 244 

model reference, 186 

nested family, 132, 177, 300 

nonconvex, 150, 205, 245, 268 

noninferior, 55 

norm-bound, 135 

on a submatrix, 134 

open-loop stability of controller, 268 

overshoot, 13, 173 

Pareto optimal, 55 

PD controller, 14 

peak tracking error, 184, 286 

performance, 171 

realizability, 147 

regulation, 12, 187, 254 

relative overshoot, 173 

rise time, 175 

RMS limit, 12, 281, 332, 335 

robustness, 210, 262 

sensible, 154, 169, 277, 285, 287 

settling time, 175 

slew-rate limit, 181 

stability, 136 

step response, 134, 172 

step response envelope, 177 

time domain, 134 

undershoot, 173 

unstated, 153 

well-posed, 57 
Spectral factorization, 82, 92, 188 
Stability, 147 

augmentation, 3 

closed-loop, 147, 150 

controller, 268 



degree, 138, 166, 232, 238, 263 

for a stable plant, 160 

free parameter representation, 156 

generalized, 165, 232 

historical, 169 

internal, 150 

modified controller paradigm, 157 

robust, 212 

specification, 136 

transfer function, 96, 150, 165 

transfer matrix, 110, 150 

transfer matrix factorization, 169 

using interpolation conditions, 155, 
284 

via Lyapunov function, 245 

via small gain theorem, 117 
Stable 

transfer matrix, 117 
Standard example plant, 41, 103, 150, 
198, 205, 213, 216, 249, 307, 
320, 331, 337, 354 

tradeoff curve, 12, 338, 366 
State-estimator gain, 279 
State-feedback, 277 

gain, 162, 279 
State-space 

computing norms, 119, 125 

controller, 43 

parametrization of stabilizing con- 
trollers, 162 

plant, 43 

small gain method connection, 245 
Step response, 95 

and peak gain, 98 

envelope constraint, 177 

interaction, 178 

overshoot, 48, 173, 355 

overshoot subgradient, 301 

perturbation, 205 

relative overshoot, 173 

sensitivity, 205 

settling time, 250 

slew-rate limit, 181 

specification, 134, 172 

total variation, 98 

undershoot, 173 
Stochastic signal norm, 75 
Stopping criterion, 313, 362 

absolute tolerance, 316 

convex optimization, 315 

relative tolerance, 317 
Strictly proper, 43 
Structured singular value, 245 
Subdifferential 

definition, 293 
Subgradient, 293 

algorithm, 348 

computing, 301 

definition, 293 

$H_2$ norm, 301 

$H_\infty$ norm, 303 

infinite-dimensional, 294 

peak gain, 305 

quasigradient, 296 

step response overshoot, 301 

tools for computing, 299 

worst case norm, 306 
Sub-level set, 131 
Supporting hyperplane, 298 
System 

failure, 45 

identification, 4, 10, 19, 215 

linear time-invariant, 30 

lumped, 29 

T 

Technology, 9 

Theorem of alternatives, 141 

Time domain 

design specification, 134 

weight, 83 
Tolerance, 316 

feasibility, 319 
Total 

energy norm, 80 

fuel norm, 81 

variation, 98 
Tracking 

asymptotic, 173 

bandwidth, 286 

error, 37, 182, 252 

peak error, 184, 286 

RMS error, 183 
Tradeoff 

curve, 12, 57, 290, 338, 366 

interactive search, 63 

norm selection, 119 

rise time versus undershoot, 283 

RMS regulated variables, 13, 280, 
338 

standard example plant, 12, 338, 366 

surface, 57, 281 



tangent to surface, 60 

tracking bandwidth versus error, 286 
Transducer, 18 
Transfer function, 30 

Chebychev norm, 99 

^-stable, 165 

maximum magnitude, 99 

peak gain, 97 

RMS response to white noise, 96 

sensitivity, 196 

shifted, 101 

stability, 96, 150, 165 

uncertainty, 216 

unstable, 96 
Transfer matrix, 30 

closed-loop, 33 

factorization, 169 

singular value plot, 110 

singular values, 110 

size of, 93 

stability, 110, 117, 150 

weight, 89 
Transient, 72, 80, 92 
Trial and error 

controller design, 20 

LQG weight selection, 350 
Tuning, 6 

rules, 5, 18 
Tv(/), 98 
Tweaking, 63 

U 

Unconstrained optimization, 311 
Undershoot, 173 

unstable zero, 283 
Unknown-but-bounded 

model, 70 

signal, 184, 190 
Unstable 

pole-zero cancellation rule, 152, 168 

transfer function, 96 

zero and rise time, 283 

zero and undershoot, 283 

V 

Vector signal 

autocorrelation, 86 
$W_{\mathrm{contr}}$, 121 
$W_{\mathrm{obs}}$, 120 

Weight 

a smart way to tweak, 332 

constant matrix, 89 

diagonal matrix, 89 

for norm, 88 

frequency domain, 81, 188 

$H_\infty$, 100 

informal adjustment, 63 

L2 norm, 100 

matrix, 89, 276, 278 

maximum value method, 63 

multivariable, 88 

nominal value method, 59, 63 

selection for LQG controller, 332, 
349 

time domain, 83 

transfer matrix, 89 

tweaking, 63 

tweaking for LQG, 281, 335 

vector, 59 
Weighted-max functional, 62, 131, 139, 

268 
Weighted-sum functional, 59, 131, 139, 

281 
Well-posed 

feedback, 32, 116 

problem, 57 
White noise, 96, 106, 110, 193, 275, 278 
Worst case 

norm, 185 

response norm, 97, 125 

subgradient for norm, 306 
Zero 

unstable, 283 



